ml-jku / vnegnn

MIT License

Issues with process_dataset.py #4

Open fangjingmu opened 2 months ago

fangjingmu commented 2 months ago

What is the structure of the training set?

(vnegnn) mou@trailblazer:~/dnapr/vnegnn$ python process_dataset.py --input_path data/scpdb/scpdb_subset_puresnet/raw/ --output_path data/scpdb/scpdb_subset_puresnet/raw/ --n_jobs 8
0it [00:00, ?it/s][Parallel(n_jobs=8)]: Using backend LokyBackend with 8 concurrent workers.
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4dya_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4c7k_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3fg2_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3rl3_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3qvx_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1j7u_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1bu5_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1y6b_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2y74_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2fj1_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2g27_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3ggp_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3gfd_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1rl4_3
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1lw0_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4nqg_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2qs1_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4lja_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2dwb_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3rbq_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3vw9_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2zbx_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3jyp_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2fle_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2rl5_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2bbw_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3ufw_3
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4a51_7
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2ylq_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/

I think this is right:

(vnegnn) mou@c05b11n03:~/dnapr/vnegnn/data/scpdb/scpdb_subset_puresnet/raw/1a2b_1$ find . | sed -e 's;[^/]*/;|;g;s;|; |;g'
.
|__ints_M.mol2
|__site.mol2
|protein.mol2
|ligand.sdf
|cavity6.mol2
|ligand.mol2
|cavityALL.mol2
|IFP.txt

But I get this error:

Traceback (most recent call last):
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/process_dataset.py", line 80, in <module>
    main()
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/process_dataset.py", line 65, in main
    result = pmap_multi(process_chunk, chunks, n_jobs=n_jobs, data_path=input_path, device=device)
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/src/utils/common.py", line 57, in pmap_multi
    results = Parallel(n_jobs=n_jobs, verbose=verbose, timeout=None)(
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 2007, in __call__
    return output if self.return_generator else list(output)
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1650, in _get_outputs
    yield from self._retrieve()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1754, in _retrieve
    self._raise_error_fast()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1789, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 745, in get_result
    return self._return_or_raise()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 763, in _return_or_raise
    raise self._result
NotADirectoryError: [Errno 20] Not a directory: 'data/scpdb/scpdb_subset_puresnet/raw/1a2b_1/ints_M.mol2/protein.pdb'
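
The failing path ends in `ints_M.mol2/protein.pdb`, so a plain file was apparently treated as a complex directory and `protein.pdb` was looked up inside it. To see which entries under a given folder would trip this up, a small check like the one below might help. It is only a sketch: `find_problem_entries` and `REQUIRED_FILES` are hypothetical names, and the list of required files is an assumption (only `protein.pdb` appears in the traceback).

```python
from pathlib import Path

# Assumption: each complex directory must contain these files. Only
# protein.pdb is confirmed by the traceback; extend the tuple as needed.
REQUIRED_FILES = ("protein.pdb",)


def find_problem_entries(root, required=REQUIRED_FILES):
    """Inspect the entries directly under `root`.

    Returns a tuple (not_dirs, missing):
      not_dirs: names of entries that are not directories
      missing:  dict mapping directory name -> required files that are absent
    """
    not_dirs = []
    missing = {}
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            # Joining a file name with "protein.pdb" is what raises
            # NotADirectoryError, so report such entries separately.
            not_dirs.append(entry.name)
            continue
        absent = [name for name in required if not (entry / name).is_file()]
        if absent:
            missing[entry.name] = absent
    return not_dirs, missing
```

Calling `find_problem_entries("data/scpdb/scpdb_subset_puresnet/raw/")` and then the same function on one complex folder such as `raw/1a2b_1` should show at which level the files and directories differ from what the script walks over.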

fangjingmu commented 2 months ago

Did you convert all of the .mol2 files to .pdb?
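
If a conversion step is indeed needed, one way to try it is with the Open Babel command-line tool (`obabel`). The sketch below is not taken from the repository: `obabel_command` and `convert_tree` are hypothetical helpers, and it assumes that only `protein.mol2` needs a `protein.pdb` next to it. By default it only builds the commands, so they can be reviewed before anything is run.

```python
import shutil
import subprocess
from pathlib import Path


def obabel_command(mol2_path):
    """Build the Open Babel command that converts one .mol2 file to .pdb."""
    src = Path(mol2_path)
    dst = src.with_suffix(".pdb")  # same folder, same stem, .pdb extension
    return ["obabel", "-imol2", str(src), "-opdb", "-O", str(dst)]


def convert_tree(root, dry_run=True):
    """Find every protein.mol2 under `root` and build its conversion command.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned. With dry_run=False each command is run with obabel.
    """
    commands = [
        obabel_command(path)
        for path in sorted(Path(root).rglob("protein.mol2"))
    ]
    if not dry_run:
        if shutil.which("obabel") is None:
            raise RuntimeError("obabel was not found on PATH")
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Whether PDB files produced this way match what `process_dataset.py` expects is not confirmed in this thread, so it would be worth checking one converted complex before processing the whole set.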