What directory structure does process_dataset.py expect for the training set? This is what I ran:
(vnegnn) mou@trailblazer:~/dnapr/vnegnn$ python process_dataset.py --input_path data/scpdb/scpdb_subset_puresnet/raw/ --output_path data/scpdb/scpdb_subset_puresnet/raw/ --n_jobs 8
0it [00:00, ?it/s][Parallel(n_jobs=8)]: Using backend LokyBackend with 8 concurrent workers.
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4dya_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4c7k_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3fg2_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3rl3_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3qvx_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1j7u_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1bu5_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1y6b_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2y74_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2fj1_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2g27_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3ggp_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3gfd_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1rl4_3
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 1lw0_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4nqg_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2qs1_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4lja_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2dwb_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3rbq_4
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3vw9_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2zbx_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3jyp_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2fle_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2rl5_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2bbw_2
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 3ufw_3
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 4a51_7
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
File not found in: 2ylq_1
datapath:data/scpdb/scpdb_subset_puresnet/raw/
Processing: data/scpdb/scpdb_subset_puresnet/raw/
I think the layout of each entry is right. For example, 1a2b_1 contains:
(vnegnn) mou@c05b11n03:~/dnapr/vnegnn/data/scpdb/scpdb_subset_puresnet/raw/1a2b_1$ find . | sed -e 's;[^/]*/;|;g;s;|; |;g'
.
|__ints_M.mol2
|__site.mol2
|protein.mol2
|ligand.sdf
|cavity6.mol2
|ligand.mol2
|cavityALL.mol2
|IFP.txt
But the run then fails with:
Traceback (most recent call last):
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/process_dataset.py", line 80, in <module>
    main()
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/process_dataset.py", line 65, in main
    result = pmap_multi(process_chunk, chunks, n_jobs=n_jobs, data_path=input_path, device=device)
  File "/lustre/grp/gyqlab/moufj/dnapr/vnegnn/src/utils/common.py", line 57, in pmap_multi
    results = Parallel(n_jobs=n_jobs, verbose=verbose, timeout=None)(
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 2007, in __call__
    return output if self.return_generator else list(output)
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1650, in _get_outputs
    yield from self._retrieve()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1754, in _retrieve
    self._raise_error_fast()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 1789, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 745, in get_result
    return self._return_or_raise()
  File "/lustre/grp/gyqlab/moufj/anaconda3/envs/vnegnn/lib/python3.9/site-packages/joblib/parallel.py", line 763, in _return_or_raise
    raise self._result
NotADirectoryError: [Errno 20] Not a directory: 'data/scpdb/scpdb_subset_puresnet/raw/1a2b_1/ints_M.mol2/protein.pdb'
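
Two things stand out in that last line: the path ends in protein.pdb, while my listing only shows protein.mol2, and ints_M.mol2 is being treated as a directory. To narrow this down, here is a minimal sketch of a layout check. It assumes (only from the error message, not from the vnegnn code) that each entry under the input path should be a directory containing protein.pdb. The function name check_layout and the required parameter are my own, hypothetical names.

```python
from pathlib import Path


def check_layout(raw_dir, required=("protein.pdb",)):
    """Inspect the entries directly under raw_dir.

    Returns (non_dirs, missing):
      non_dirs: names of entries that are plain files, not directories
      missing:  dict mapping entry name -> list of required files it lacks

    The default required file name is an assumption taken from the
    error message above, not from the processing script itself.
    """
    non_dirs = []
    missing = {}
    for entry in sorted(Path(raw_dir).iterdir()):
        if not entry.is_dir():
            # A plain file at this level would trigger NotADirectoryError
            # if a script appends another path component to it.
            non_dirs.append(entry.name)
            continue
        absent = [name for name in required if not (entry / name).is_file()]
        if absent:
            missing[entry.name] = absent
    return non_dirs, missing


# Example call (path taken from the command above):
# non_dirs, missing = check_layout("data/scpdb/scpdb_subset_puresnet/raw/")
# print(len(non_dirs), "plain files;", len(missing), "entries missing files")
```

Running the same check with the entry directory itself as raw_dir (for example raw/1a2b_1) would list every .mol2 file in non_dirs, which matches the shape of the path in the error.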