Open esraagithub opened 2 years ago
Hi @esraagithub, your file sample1_deepbgc_prepare_result.tsv contains the BGC samples, is that correct? This file will need to contain an in_cluster column, which will have a value of 1 in all rows (in case the file only contains "positive" BGC samples). Your file should also contain a sequence_id column, which should contain an identifier of each BGC.
@prihoda Thank you for your response. Yes, this file, sample1_deepbgc_prepare_result.tsv, was produced by the deepbgc prepare command. It does contain a sequence_id column and an in_cluster column, but the in_cluster column has 0 in all rows, not 1. I don't know why it contains only zeros.
Hi @esraagithub, if that file contains just BGC samples, you can manually change the value to 1 in all rows.
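For anyone who prefers to script that change rather than edit the file by hand, here is a minimal sketch using only the Python standard library. It assumes the prepared file is tab-separated with a header row that includes an in_cluster column; the function name `set_in_cluster` and the output file name are hypothetical and not part of DeepBGC.

```python
import csv


def set_in_cluster(src_path, dst_path, value=1):
    """Copy a tab-separated file, overwriting the in_cluster column with `value`.

    Hypothetical helper, not part of DeepBGC. Assumes a header row that
    contains an in_cluster column. Returns the number of data rows written.
    """
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src, delimiter="\t")
        if reader.fieldnames is None or "in_cluster" not in reader.fieldnames:
            raise ValueError("no in_cluster column found in " + src_path)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Every row is treated as a positive (or negative) sample.
    for row in rows:
        row["in_cluster"] = str(value)

    with open(dst_path, "w", newline="") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Usage would look like `set_in_cluster("sample1_deepbgc_prepare_result.tsv", "sample1_positive.tsv")`, where the second file name is just an example. Writing to a new file keeps the original output of deepbgc prepare untouched.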
Thank you, I will try it.
Thank you, I tried it and it worked well, but I get another error in the next step, "training the classifier":
raise ValueError('No overlap found between classes and samples. Classes should be indexed by sequence_id.')
ValueError: No overlap found between classes and samples. Classes should be indexed by sequence_id.
ERROR 23/05 23:53:18 ================================================================================
ERROR 23/05 23:53:18 DeepBGC failed with ValueError: No overlap found between classes and samples. Classes should be indexed by sequence_id.
ERROR 23/05 23:53:18 ================================================================================
I have a "sequence id" column in the sample file (which I got from deepbgc prepare), but there is no overlap between it and the classes file. What should I do in this case? Does this mean I can't proceed with training? @prihoda
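One way to see why there is no overlap is to compare the identifiers in the two files directly. The sketch below is only a diagnostic and makes some assumptions: the samples file is tab-separated with a sequence_id column, and the classes file is comma-separated with a header row and the identifiers in its first column. The classes file layout is a guess based on the error message ("Classes should be indexed by sequence_id"), so adjust `classes_delimiter` if your file differs. The function names are hypothetical.

```python
import csv


def read_column(path, column, delimiter):
    """Return the set of values found in a named column of a delimited file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError("no %s column found in %s" % (column, path))
        return {row[column] for row in reader}


def read_first_column(path, delimiter):
    """Return the set of values in the first column, skipping the header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # skip the header row
        return {row[0] for row in reader if row}


def compare_ids(samples_path, classes_path, classes_delimiter=","):
    """Return (shared, only_in_samples, only_in_classes) as sets of ids."""
    sample_ids = read_column(samples_path, "sequence_id", "\t")
    class_ids = read_first_column(classes_path, classes_delimiter)
    return sample_ids & class_ids, sample_ids - class_ids, class_ids - sample_ids
```

If the first set comes back empty, printing a few entries from the other two sets usually shows whether the identifiers differ only in formatting (for example a version suffix or a file extension) or refer to different sequences altogether.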
Hello, I ran into a problem while training a detector on my sample.
Here is the error message. It says I didn't use a negative dataset, but I actually used one, called GeneSwap_Negatives.pfam.tsv. I think deepbgc can't see the negative dataset because of an error in my command. --help didn't say where or how to specify it.
INFO 15/05 09:28:22 Loaded 41102 samples and 80777 domains from sample1_deepbgc_prepare_result.tsv
INFO 15/05 09:28:28 Loaded 10128 samples and 706950 domains from GeneSwap_Negatives.pfam.tsv
ERROR 15/05 09:28:33 Got target variable with only one value {0} in: ['sample1_deepbgc_prepare_result.tsv', 'GeneSwap_Negatives.pfam.tsv']
Traceback (most recent call last):
  File "/root/esraa/miniconda3/envs/deepbgcv0.1.29/lib/python3.7/site-packages/deepbgc/main.py", line 113, in main
    run(argv)
  File "/root/esraa/miniconda3/envs/deepbgcv0.1.29/lib/python3.7/site-packages/deepbgc/main.py", line 102, in run
    args.func.run(**args_dict)
  File "/root/esraa/miniconda3/envs/deepbgcv0.1.29/lib/python3.7/site-packages/deepbgc/command/train.py", line 60, in run
    train_samples, train_y = util.read_samples(inputs, target)
  File "/root/esraa/miniconda3/envs/deepbgcv0.1.29/lib/python3.7/site-packages/deepbgc/util.py", line 574, in read_samples
    'Did you provide positive and negative samples?')
ValueError: ("Got target variable with only one value {0} in: ['sample1_deepbgc_prepare_result.tsv', 'GeneSwap_Negatives.pfam.tsv']", 'At least two values are required to train a model. ', 'Did you provide positive and negative samples?')
ERROR 15/05 09:28:33 ================================================================================
ERROR 15/05 09:28:33 DeepBGC failed with ValueError: Got target variable with only one value {0} in: ['sample1_deepbgc_prepare_result.tsv', 'GeneSwap_Negatives.pfam.tsv']
ERROR 15/05 09:28:33 ================================================================================
ERROR 15/05 09:28:33 At least two values are required to train a model.
ERROR 15/05 09:28:33 Did you provide positive and negative samples?
ERROR 15/05 09:28:33 ================================================================================
My command:
deepbgc train --model templates/deepbgc.json --output MyDeepBGCDetector.pkl sample1_deepbgc_prepare_result.tsv GeneSwap_Negatives.pfam.tsv --config PFAM2VEC pfam2vec .csv -v ClusterFinder_Annotated_Contigs.full.gbk
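The log shows that both files were loaded, and the error reports that the target column held only the value 0 across them. A quick way to confirm this before training is to count the label values in each input file. The following is a rough sketch, assuming both files are tab-separated with a header row and an in_cluster column; the helper names are hypothetical and not part of DeepBGC.

```python
import csv
from collections import Counter


def label_counts(paths, column="in_cluster"):
    """Count the values of the label column in each tab-separated file."""
    counts = {}
    for path in paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            if reader.fieldnames is None or column not in reader.fieldnames:
                raise ValueError("no %s column found in %s" % (column, path))
            counts[path] = Counter(row[column] for row in reader)
    return counts


def has_both_labels(counts):
    """Return True if at least two distinct label values occur overall."""
    seen = set()
    for counter in counts.values():
        seen.update(counter)
    return len(seen) >= 2
```

Calling `label_counts(["sample1_deepbgc_prepare_result.tsv", "GeneSwap_Negatives.pfam.tsv"])` returns one counter per file. If every counter contains only "0", the positive file still needs its in_cluster column set to 1.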