Open E0287979 opened 9 months ago
Sorry for the late response on this. ConPLex expects files formatted as in https://github.com/samsledje/ConPLex/blob/main/tests/toy_predict.tsv, that is, a tab-separated file with columns for the protein/molecule identifiers, followed by the sequences/SMILES strings that describe them. One common error with tab-separated files written by hand is that the "tab" is actually four spaces, which isn't parsed properly. Make sure you're using a real tab character (\t) when creating this file.
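If it helps, here is a minimal sketch for spotting lines where spaces were typed instead of tabs. The function name `find_tsv_problems` and the column count of 4 are assumptions for illustration, not part of ConPLex; adjust the count to match the example file linked above.

```python
def find_tsv_problems(lines, expected_columns):
    """Return (line_number, message) pairs for lines that do not split
    into `expected_columns` tab-separated fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\r\n")
        if not stripped:
            continue  # ignore blank lines
        fields = stripped.split("\t")
        if len(fields) != expected_columns:
            # A run of four spaces often means a tab was typed as spaces.
            hint = " (contains runs of spaces, possibly instead of a tab)" if "    " in stripped else ""
            problems.append(
                (number, f"expected {expected_columns} fields, found {len(fields)}{hint}")
            )
    return problems


# Hypothetical rows: the first uses real tabs, the second uses four spaces.
example_lines = [
    "P1\tM1\tMKV\tCCO\n",
    "P2    M2    MKT    CCN\n",
]
for line_number, message in find_tsv_problems(example_lines, expected_columns=4):
    print(f"line {line_number}: {message}")
```

To check a real file, pass an open file handle as `lines`, e.g. `find_tsv_problems(open("my_predict.tsv"), 4)`, where the file name is a placeholder.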
Is there any specific format requirement for the prediction TSV? I am able to run the predict function when I use the TSV from the repository.
I get an error when I try to predict on a file I generated using surfaceome cayman as the backbone.
Traceback (most recent call last):
File "/anaconda3/envs/conplex-dti/bin/conplex-dti", line 6, in
sys.exit(main())
File "/ConPLex/conplex_dti/main.py", line 41, in main
args.main_func(args)
File "/ConPLex/conplex_dti/cli/predict.py", line 104, in main
drug_featurizer.preload(query_df["moleculeSmiles"].unique())
File "/ConPLex/conplex_dti/featurizer/base.py", line 162, in preload
if seq in h5fi:
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/anaconda3/envs/conplex-dti/lib/python3.9/site-packages/h5py/_hl/group.py", line 514, in contains
return h5g._path_valid(self.id, self._e(name), self._lapl)
File "/anaconda3/envs/conplex-dti/lib/python3.9/site-packages/h5py/_hl/base.py", line 206, in _e
raise TypeError(f"A name should be string or bytes, not {type(name)}")
TypeError: A name should be string or bytes, not <class 'float'>
I think features returned NaN
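One possible explanation for the float, offered as a guess from the traceback rather than a confirmed diagnosis: pandas reads empty cells (and strings such as "NA") as NaN by default, and NaN is a float, so a missing SMILES value in the input file would reach h5py as a float. A minimal stdlib-only sketch for finding such cells follows. The function name, the column count of 4, and the assumption that the file has no header row are all illustrative; the token list is only a subset of what pandas treats as missing.

```python
import csv

# Subset of the strings pandas.read_csv treats as missing by default (not exhaustive).
MISSING_TOKENS = {"", "NA", "N/A", "NaN", "nan", "NULL", "null"}


def find_missing_cells(lines, expected_columns):
    """Return (row_number, column_index) pairs for cells that would likely
    load as NaN: empty cells, missing-value tokens, or absent trailing fields."""
    missing = []
    reader = csv.reader(lines, delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        # Pad short rows so absent trailing fields are reported as missing.
        padded = row + [""] * (expected_columns - len(row))
        for column_index, value in enumerate(padded[:expected_columns]):
            if value in MISSING_TOKENS:
                missing.append((row_number, column_index))
    return missing


# Hypothetical rows: row 2 has an empty SMILES cell, row 3 lacks the field entirely.
example_lines = [
    "P1\tM1\tMKV\tCCO\n",
    "P2\tM2\tMKT\t\n",
    "P3\tM3\tMKL\n",
]
print(find_missing_cells(example_lines, expected_columns=4))
```

Any row reported here is worth fixing or dropping before running predict. Note that a very short sequence that happens to match a token (for example "NA") would also be flagged, which mirrors how pandas would treat it with default settings.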