Merck / deepbgc

BGC Detection and Classification Using Deep Learning
https://doi.org/10.1093/nar/gkz654
MIT License

DeepBGC failed with ValueError: missing molecule_type in annotations #67

Open chengyou96 opened 2 years ago

chengyou96 commented 2 years ago

Hi,

When I tried to use DeepBGC to analyze a GenBank file downloaded from NCBI, I encountered an error like this:

Using TensorFlow backend.
WARNING 11/02 15:54:49 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING 11/02 15:54:49 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING 11/02 15:54:49 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING 11/02 15:54:49 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

WARNING 11/02 15:54:49 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use rate instead of keep_prob. Rate should be set to rate = 1 - keep_prob.
WARNING 11/02 15:54:50 From /opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

INFO 11/02 15:54:50 Loading model from: /Users/cynthiayo/Library/Application Support/deepbgc/data/0.1.0/classifier/product_class.pkl
/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator RandomForestClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
INFO 11/02 15:54:52 Loading model from: /Users/cynthiayo/Library/Application Support/deepbgc/data/0.1.0/classifier/product_activity.pkl
INFO 11/02 15:54:52 Processing input file 1/1: sequence.gb
INFO 11/02 15:54:52 ================================================================================
INFO 11/02 15:54:52 Processing record #1: KB946332.1
INFO 11/02 15:54:52 Sequence already contains 802 CDS features, skipping CDS detection
INFO 11/02 15:54:52 Detecting Pfam domains in "KB946332.1" using HMMER hmmscan, this might take a while...
INFO 11/02 15:59:29 HMMER hmmscan Pfam detection done in 0h4m37s
INFO 11/02 16:00:07 Added 1795 Pfam domains (947 unique PFAM_IDs)
INFO 11/02 16:00:07 Detecting BGCs using deepbgc model in KB946332.1
INFO 11/02 16:00:10 Detected 3 BGCs using deepbgc model in KB946332.1
INFO 11/02 16:00:10 Classifying 3 BGCs using product_class model in KB946332.1
INFO 11/02 16:00:10 Classifying 3 BGCs using product_activity model in KB946332.1
INFO 11/02 16:00:10 Saving processed record KB946332.1
ERROR 11/02 16:00:10 missing molecule_type in annotations
Traceback (most recent call last):
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/main.py", line 113, in main
    run(argv)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/main.py", line 102, in run
    args.func.run(**args_dict)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/command/pipeline.py", line 181, in run
    writer.write(record)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/output/bgc_genbank.py", line 26, in write
    SeqIO.write(cluster_record, self.fd, 'genbank')
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/Bio/SeqIO/__init__.py", line 530, in write
    count = writer_class(handle).write_file(sequences)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/Bio/SeqIO/Interfaces.py", line 244, in write_file
    count = self.write_records(records, maxcount)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/Bio/SeqIO/Interfaces.py", line 218, in write_records
    self.write_record(record)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/Bio/SeqIO/InsdcIO.py", line 981, in write_record
    self._write_the_first_line(record)
  File "/opt/anaconda3/envs/deepbgc/lib/python3.7/site-packages/Bio/SeqIO/InsdcIO.py", line 744, in _write_the_first_line
    raise ValueError("missing molecule_type in annotations")
ValueError: missing molecule_type in annotations
ERROR 11/02 16:00:10 ================================================================================
ERROR 11/02 16:00:10 DeepBGC failed with ValueError: missing molecule_type in annotations
ERROR 11/02 16:00:10 ================================================================================

Could you help me with the problem mentioned here?

Thank you!

prihoda commented 2 years ago

Hi @chengyou96. Can you share the output of deepbgc info?

chengyou96 commented 2 years ago

Sure. Here it is:

INFO 22/03 16:06:09 Loading model from: /Users/cynthiayo/Library/Application Support/deepbgc/data/0.1.0/detector/deepbgc.pkl

Caiyulu-818 commented 1 year ago

Have you solved this problem yet?

KarenGoncalves commented 7 months ago

Having the same issue here. Here is the output of deepbgc info:

 _____                  ____    ____   ____
 |  _ \  ___  ___ ____ | __ )  / ___) / ___)
 | | \ \/ _ \/ _ \  _ \|  _ \ | |  _ | |
 | |_/ /  __/  __/ |_) | |_) || |_| || |___
 |____/ \___|\___| ___/|____/  \____| \____)
=================|_|===== version 0.1.18 =====
INFO    29/01 14:08:32   Available data files: ['Pfam-A.31.0.hmm.h3m', 'Pfam-A.31.0.hmm', 'Pfam-A.31.0.hmm.h3i', 'Pfam-A.31.0.hmm.h3p', 'Pfam-A.31.0.hmm.h3f', 'Pfam-A.31.0.clans.tsv']
INFO    29/01 14:08:32   ================================================================================
INFO    29/01 14:08:32   Available detectors: ['clusterfinder_original', 'clusterfinder_geneborder', 'deepbgc', 'clusterfinder_retrained']
INFO    29/01 14:08:32   --------------------------------------------------------------------------------
INFO    29/01 14:08:32   Model: clusterfinder_original
INFO    29/01 14:08:32   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_original.pkl
INFO    29/01 14:08:33   Type: ClusterFinderHMM
INFO    29/01 14:08:33   Version: 0.1.0
INFO    29/01 14:08:33   Timestamp: 1551449904.5101252 (2019-03-01T09:18:24.510125)
INFO    29/01 14:08:33   --------------------------------------------------------------------------------
INFO    29/01 14:08:33   Model: clusterfinder_geneborder
INFO    29/01 14:08:33   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_geneborder.pkl
INFO    29/01 14:08:33   Type: GeneBorderHMM
INFO    29/01 14:08:33   Version: 0.1.0
INFO    29/01 14:08:33   Timestamp: 1551449863.1564941 (2019-03-01T09:17:43.156494)
INFO    29/01 14:08:33   --------------------------------------------------------------------------------
INFO    29/01 14:08:33   Model: deepbgc
INFO    29/01 14:08:33   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/detector/deepbgc.pkl
Using TensorFlow backend.
WARNING 29/01 14:08:44   From /home/karencgs/deepbcg/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3733: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING 29/01 14:08:44   From /home/karencgs/deepbcg/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING 29/01 14:08:44   From /home/karencgs/deepbcg/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:216: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING 29/01 14:08:45   From /home/karencgs/deepbcg/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:223: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

INFO    29/01 14:08:45   Type: KerasRNN
INFO    29/01 14:08:45   Version: 0.1.0
INFO    29/01 14:08:45   Timestamp: 1551305667.986168 (2019-02-27T17:14:27.986168)
INFO    29/01 14:08:45   --------------------------------------------------------------------------------
INFO    29/01 14:08:45   Model: clusterfinder_retrained
INFO    29/01 14:08:45   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_retrained.pkl
INFO    29/01 14:08:45   Type: DiscreteHMM
INFO    29/01 14:08:45   Version: 0.1.0
INFO    29/01 14:08:45   Timestamp: 1551449925.734045 (2019-03-01T09:18:45.734045)
INFO    29/01 14:08:45   ================================================================================
INFO    29/01 14:08:45   Available classifiers: ['product_class', 'product_activity']
INFO    29/01 14:08:45   --------------------------------------------------------------------------------
INFO    29/01 14:08:45   Model: product_class
INFO    29/01 14:08:45   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/classifier/product_class.pkl
/home/karencgs/deepbcg/lib/python3.7/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.ensemble.forest module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.ensemble. Anything that cannot be imported from sklearn.ensemble is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/karencgs/deepbcg/lib/python3.7/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.tree.tree module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.tree. Anything that cannot be imported from sklearn.tree is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/karencgs/deepbcg/lib/python3.7/site-packages/sklearn/base.py:334: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.18.2 when using version 0.23.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/home/karencgs/deepbcg/lib/python3.7/site-packages/sklearn/base.py:334: UserWarning: Trying to unpickle estimator RandomForestClassifier from version 0.18.2 when using version 0.23.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
INFO    29/01 14:08:45   Type: RandomForestClassifier
INFO    29/01 14:08:45   Version: 0.1.0
INFO    29/01 14:08:45   Timestamp: 1551781410.019103 (2019-03-05T05:23:30.019103)
INFO    29/01 14:08:45   --------------------------------------------------------------------------------
INFO    29/01 14:08:45   Model: product_activity
INFO    29/01 14:08:45   Loading model from: /home/karencgs/.local/share/deepbgc/data/0.1.0/classifier/product_activity.pkl
INFO    29/01 14:08:45   Type: RandomForestClassifier
INFO    29/01 14:08:45   Version: 0.1.0
INFO    29/01 14:08:45   Timestamp: 1551781433.886473 (2019-03-05T05:23:53.886473)
INFO    29/01 14:08:45   ================================================================================
INFO    29/01 14:08:45   All OK
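
For anyone landing here with the same traceback: the ValueError is raised by Biopython's GenBank writer (Bio/SeqIO/InsdcIO.py in the traceback above), which since Biopython 1.78 refuses to write a record whose annotations lack a molecule_type entry. The thread does not confirm a fix, so the following is only a minimal sketch of one possible workaround, not the maintainers' solution. The helper name ensure_molecule_type and the default value "DNA" are assumptions made for illustration; the stand-in object replaces a real SeqRecord so the snippet runs without Biopython installed.

```python
from types import SimpleNamespace


def ensure_molecule_type(record, default="DNA"):
    """Set annotations["molecule_type"] on a record if it is missing.

    `record` is any object with a dict-like `annotations` attribute,
    such as a Biopython SeqRecord. An existing value is left untouched.
    The name of this helper and the "DNA" default are assumptions.
    """
    record.annotations.setdefault("molecule_type", default)
    return record


# Stand-in for a SeqRecord so the sketch runs without Biopython installed.
cluster_record = SimpleNamespace(id="KB946332.1", annotations={})
ensure_molecule_type(cluster_record)
print(cluster_record.annotations)  # {'molecule_type': 'DNA'}
```

With Biopython itself, the same helper could be applied to each SeqRecord just before the SeqIO.write(record, handle, "genbank") call. Some users may instead prefer to install a Biopython release older than 1.78, which predates this check; whether that suits a given DeepBGC version is not established in this thread.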