Open Shians opened 3 years ago
That's a pain. Thanks for letting me know. If you try one of the new models I added ("CAAY..." or "CRAA...") does the error still occur?
The new models don't seem to work with the older 0.22 sklearn. In the new version they give the following error:
mCaller.py -m GATC -r e_coli.fa -d ~/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/CAAYNNNNNRTAC_model_6_m6A.pkl -e events.tsv -f fastq/pcr_ecoli.fastq.gz -b A
1 contigs
1 threads
Error: could not find sequence for reference contig contig
4a2c2f99-da0d-4d53-96f3-1573d778d0ae 782 ACCGGMTCGAT 0.8,-0.195,-1.1199999999999999,1.28,-1.99,-0.9225,18.939263322884013 - - Index or Key Error
dict_keys(['general']) dict_keys(['MG', 'MC', 'MA', 'MT', 'MM', 'MH', 'AT', 'AC', 'AG', 'AA', 'AM']) MT
'MH'
Traceback (most recent call last):
  File "/home/shians/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/extract_contexts.py", line 199, in extract_features
    mod_prob = model[twobase_model].predict_proba([diffs]) #TODO: call model only when batch ready to write
KeyError: 'MH'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 187, in <module>
    main()
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 184, in main
    args.train,modelfile,args.skip_thresh,args.qual_thresh,args.classifier,training_tsv,args.plot_training)
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 58, in distribute_threads
    extract_features(tsvname,refname,read2qual,nvariables,skip_thresh,qual_thresh,modelfile,classifier,0,endline=bytesize,train=train,pos_label=training_pos_dict,base=base,motif=motif,positions_list=positions_list)
  File "/home/shians/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/extract_contexts.py", line 222, in extract_features
    print(model[twobase_model].predict_proba([diffs]))
KeyError: 'MH'
I need to fix this, but for now, please change "twobase = True" to "twobase = False" in extract_contexts.py (line 134 in my version).
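For anyone debugging this, the printed keys above suggest the loaded model dict only has a 'general' entry while the code looks up a two-base key such as 'MH'. Below is a minimal sketch of a more forgiving lookup. Note that `pick_model` is a hypothetical helper and not part of mCaller, and the strings stand in for real scikit-learn classifiers.

```python
def pick_model(models, twobase_key):
    """Return the classifier for a two-base context, falling back to 'general'.

    Hypothetical helper for illustration; mCaller itself indexes the dict directly.
    """
    if not isinstance(models, dict):
        # Older pickles may hold a single classifier rather than a dict.
        return models
    if twobase_key in models:
        return models[twobase_key]
    if "general" in models:
        return models["general"]
    raise KeyError(
        f"no model for {twobase_key!r} and no 'general' fallback; "
        f"available keys: {sorted(models)}"
    )


# Stand-in objects; a real pickle would hold scikit-learn classifiers.
general_only = {"general": "general-classifier"}
print(pick_model(general_only, "MH"))
```

With a lookup like this, a model file that only contains 'general' would still be usable when the two-base flag is on, and a file with neither key would fail with a message listing what is actually in the pickle.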
Hi, I tried using the mCaller tool and encountered the following error: ModuleNotFoundError: No module named 'sklearn.neural_network.multilayer_perceptron'.
Next, I tried the suggested solution, changing "twobase = True" to "twobase = False" in extract_contexts.py (line 130 in my version):
    if type(model) != dict:
        model = {'general':model} #for compatibility with previously trained model
        twobase = False
    else:
        twobase = True
    base_model = base_models(base,twobase)
but the error was still not resolved. Please help me resolve this issue.
Thanks Nihar
Hi Nihar,
Which version of scikit learn do you have installed? From the previous poster: "My solution for this was to downgrade sklearn to 0.22, so maybe that should be specified as a requirement or otherwise a higher version should be specified with an updated pickle."
(in addition to the twobase = False change)
Hi, Thanks for your response. We have installed scikit-learn versions 0.24 and 0.22. We also installed the updated pickle version 5, but we are still getting the same error. Command used: mCaller.py -p positions.txt -r Reference.fasta -d mCaller/r95_twobase_model_NN_6_m6A.pkl -e sample_eventalign.tsv -f sample.fastq -b A
Hi, it looks like the correct version is still not accessible. Are you using a package manager?
If you run python in a terminal and then enter
import sklearn as skl
print(skl.__version__)
what does it show?
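If several scikit-learn versions are installed, it can also help to see which interpreter and which copy of the package are actually being picked up. A small sketch along those lines follows; `describe_sklearn` is just an illustrative name.

```python
import sys


def describe_sklearn():
    """Report which interpreter is running and which scikit-learn it imports."""
    info = {"python": sys.executable, "sklearn_version": None, "sklearn_path": None}
    try:
        import sklearn
    except ImportError:
        # scikit-learn is not installed for this interpreter.
        return info
    info["sklearn_version"] = sklearn.__version__
    info["sklearn_path"] = sklearn.__file__
    return info


for key, value in describe_sklearn().items():
    print(f"{key}: {value}")
```

Running this with the same python that launches mCaller.py should show whether the 0.22 install is the one in use.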
Hi, Thanks for your suggestions. I was finally able to generate some output using the -m option. I have 2 queries now. 1) How do I generate the positions.txt file, which is one of the input files for 6mA methylation detection? 2) How do I interpret the output columns generated by mCaller with the -m option?
J02459.1 b49cc8bb-d98c-4f34-bfac-e7396b1f3dd0 3019 AGACGMTCTGG 0.07,2.64,1.78,6.32,12.53,7.1,7.424780316344464 + m6A 0.94
Hi, Can you please provide the output column descriptions and also let me know how to generate the positions.txt file for 6mA analysis?
Thanks Nihar
Hi Nihar,
Please see the README file.
Positions file: "file with a list of positions at which to classify bases (must be formatted as space- or tab-separated file with chromosome, position, strand, and label if training)" (this is optional, you can also try predicting based on motifs by using the -m option instead)
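As a rough illustration of that description, here is a sketch that writes a tab-separated positions file with chromosome, position, and strand, plus an optional label column. The site reuses the contig name and position from the output row earlier in this thread purely as placeholder values. Whether positions are 0- or 1-based, and what label text training expects, are not stated here, so please check the README before relying on it. `write_positions` is a hypothetical helper.

```python
import csv
import io


def write_positions(handle, sites):
    """Write one tab-separated line per site: chromosome, position, strand[, label]."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for site in sites:
        writer.writerow(site)


# Placeholder site; the coordinate convention is an assumption.
sites = [("J02459.1", 3019, "+")]

buffer = io.StringIO()  # swap for open("positions.txt", "w", newline="") to write a file
write_positions(buffer, sites)
print(buffer.getvalue(), end="")
```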
Output columns: "This returns a tabbed file with per-read predictions, where columns indicate chromosome, read name, genomic position, position k-mer context, features, strand, label, and probability of methylation predicted by mCaller for that position and read".
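To make the column order concrete, here is a sketch that splits one output row into named fields following the README description. It assumes the file is tab-separated with exactly eight columns and that the features column is a comma-separated list of numbers, as in the example row posted above. `Prediction` and `parse_prediction` are illustrative names, not part of mCaller.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    chromosome: str
    read_name: str
    position: int
    context: str           # k-mer context around the position
    features: List[float]
    strand: str
    label: str
    probability: float     # predicted probability of methylation


def parse_prediction(line):
    """Parse one tab-separated row of per-read predictions."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}")
    chrom, read, pos, context, features, strand, label, prob = fields
    return Prediction(
        chrom, read, int(pos), context,
        [float(x) for x in features.split(",")],
        strand, label, float(prob),
    )


# Example row from this thread, rejoined with tabs.
row = "\t".join([
    "J02459.1", "b49cc8bb-d98c-4f34-bfac-e7396b1f3dd0", "3019", "AGACGMTCTGG",
    "0.07,2.64,1.78,6.32,12.53,7.1,7.424780316344464", "+", "m6A", "0.94",
])
print(parse_prediction(row))
```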
Hi, Can you please answer my query regarding the availability of an updated version of the mCaller tool?
Thanks Nihar
Hi Nihar, Unfortunately, there are no current plans for an updated version as everyone involved has moved on to other projects.
Hi, I tried using the mCaller tool and encountered an error:
ModuleNotFoundError: No module named 'sklearn.neural_network.multilayer_perceptron'
Then I downgraded sklearn to 0.22 and changed "twobase = True" to "twobase = False" in extract_contexts.py (line 130 in my version). But there are still problems:
Traceback (most recent call last):
  File "/public/home/yizhou/met/mcaller/mCaller-master/extract_contexts.py", line 199, in extract_features
    mod_prob = model[twobase_model].predict_proba([diffs]) #TODO: call model only when batch ready to write
KeyError: 'general'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/public/home/yizhou/miniconda3/envs/mcaller/bin/mCaller.py", line 187, in
I don't know how to solve this error. I hope someone kind can help me.
Thank you
Under the latest version of sklearn (0.24.2) it appears that the MLP modules shifted around a bit. As a result, I got the following error when trying to run mCaller.
Since you don't import this directly in the Python code, I think it comes from the pickle file like this.
My solution for this was to downgrade sklearn to 0.22, so maybe that should be specified as a requirement or otherwise a higher version should be specified with an updated pickle.
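For context on why a pickle can trigger an import error: pickle stores a class by its module path and name, and re-imports that module when loading. The sketch below reproduces the effect with a throwaway module. The name `demo_old_mlp_location` is made up for the demo and has nothing to do with scikit-learn's real layout.

```python
import pickle
import sys
import types

MODULE_NAME = "demo_old_mlp_location"  # made-up module name for this demo

# Build a throwaway module holding one class, standing in for a library module.
module = types.ModuleType(MODULE_NAME)
Classifier = type("Classifier", (), {"__module__": MODULE_NAME})
module.Classifier = Classifier
sys.modules[MODULE_NAME] = module

# The pickle records "demo_old_mlp_location.Classifier", not the class body.
payload = pickle.dumps(Classifier())

# Simulate a library upgrade that moved or removed the module.
del sys.modules[MODULE_NAME]

try:
    pickle.loads(payload)
    outcome = "loaded"
except ModuleNotFoundError as err:
    outcome = f"failed: {err}"
print(outcome)
```

This is consistent with the idea that the model would need to be re-pickled under the newer scikit-learn, or the older version pinned, for the file to load.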