al-mcintyre / mCaller

A python program to call methylation (m6A in DNA) from nanopore signal data
MIT License

No module named 'sklearn.neural_network.multilayer_perceptron' #29

Open Shians opened 3 years ago

Shians commented 3 years ago

In the latest version of sklearn (0.24.2), the MLP modules appear to have been moved around. As a result, I got the following error when trying to run mCaller:

ModuleNotFoundError: No module named 'sklearn.neural_network.multilayer_perceptron'

Since you don't import this module directly in the Python code, I think the reference comes from the pickled model file.

My solution was to downgrade sklearn to 0.22, so maybe that version should be specified as a requirement. Otherwise, a higher version could be specified along with an updated pickle.
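As a possible alternative to downgrading, the old module path stored in the pickle can be remapped while the file is loaded. The sketch below is not part of mCaller: `RenamingUnpickler`, `load_with_renames`, and `SKLEARN_RENAMES` are hypothetical names, and the mapping assumes the MLP classes now live in the private module `sklearn.neural_network._multilayer_perceptron`. Even if the import then succeeds, scikit-learn does not guarantee that pickles load correctly across versions, so any predictions from a model loaded this way should be checked.

```python
import io
import pickle

# Assumed mapping from the module path stored in the old pickle to the
# private module that holds the MLP classes in newer scikit-learn releases.
SKLEARN_RENAMES = {
    "sklearn.neural_network.multilayer_perceptron":
        "sklearn.neural_network._multilayer_perceptron",
}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites module paths before classes are looked up."""

    def __init__(self, file, renames):
        super().__init__(file)
        self._renames = dict(renames)

    def find_class(self, module, name):
        # pickle calls find_class for every class or function reference,
        # so this is the one place where an old path can be swapped out.
        module = self._renames.get(module, module)
        return super().find_class(module, name)


def load_with_renames(data, renames=SKLEARN_RENAMES):
    """Unpickle `data` (bytes), remapping any renamed module paths."""
    return RenamingUnpickler(io.BytesIO(data), renames).load()
```

To try it, read the model file in binary mode and pass its bytes to `load_with_renames`. Only unpickle files you trust, since loading a pickle can execute arbitrary code.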

al-mcintyre commented 3 years ago

That's a pain. Thanks for letting me know. If you try one of the new models I added ("CAAY..." or "CRAA...") does the error still occur?

Shians commented 3 years ago

The new models don't seem to work with the older 0.22 sklearn. With the new version, they give the following error:

mCaller.py -m GATC -r e_coli.fa -d ~/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/CAAYNNNNNRTAC_model_6_m6A.pkl -e events.tsv -f fastq/pcr_ecoli.fastq.gz -b A

1 contigs
1 threads
Error: could not find sequence for reference contig contig
4a2c2f99-da0d-4d53-96f3-1573d778d0ae    782 ACCGGMTCGAT 0.8,-0.195,-1.1199999999999999,1.28,-1.99,-0.9225,18.939263322884013    - - Index or Key Error
dict_keys(['general']) dict_keys(['MG', 'MC', 'MA', 'MT', 'MM', 'MH', 'AT', 'AC', 'AG', 'AA', 'AM']) MT
'MH'
Traceback (most recent call last):
  File "/home/shians/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/extract_contexts.py", line 199, in extract_features
    mod_prob = model[twobase_model].predict_proba([diffs]) #TODO: call model only when batch ready to write
KeyError: 'MH'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 187, in <module>
    main()
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 184, in main
    args.train,modelfile,args.skip_thresh,args.qual_thresh,args.classifier,training_tsv,args.plot_training)
  File "/home/shians/anaconda3/envs/mcaller/bin/mCaller.py", line 58, in distribute_threads
    extract_features(tsvname,refname,read2qual,nvariables,skip_thresh,qual_thresh,modelfile,classifier,0,endline=bytesize,train=train,pos_label=training_pos_dict,base=base,motif=motif,positions_list=positions_list) 
  File "/home/shians/anaconda3/envs/mcaller/lib/python3.6/site-packages/mCaller/extract_contexts.py", line 222, in extract_features
    print(model[twobase_model].predict_proba([diffs]))
KeyError: 'MH'

al-mcintyre commented 3 years ago

I need to fix this, but for now, please change "twobase = True" to "twobase = False" in extract_contexts.py (line 134 in my version)
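The traceback above shows a lookup of the key 'MH' in a model dict whose only key is 'general'. A defensive lookup along the following lines would avoid that mismatch without editing the flag by hand. This is only a sketch, not the author's fix: `select_classifier` is a hypothetical helper, and it assumes that falling back to the 'general' classifier is acceptable when no context-specific one exists.

```python
def select_classifier(model, twobase_key):
    """Return the classifier to use for one two-base context.

    `model` is either a bare classifier or a dict of classifiers keyed by
    context (for example 'MH') or by 'general'.
    """
    if not isinstance(model, dict):
        return model  # a pickle holding a single classifier
    if twobase_key in model:
        return model[twobase_key]  # context-specific classifier
    if "general" in model:
        return model["general"]  # fall back to the single general classifier
    raise KeyError(
        "no classifier for context %r; available keys: %s"
        % (twobase_key, sorted(model))
    )
```

With a helper like this, the call in the traceback would become `select_classifier(model, twobase_model).predict_proba([diffs])`, and a missing key would at least report which keys the loaded model actually contains.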

dasn588 commented 1 year ago

Hi, I tried using the mCaller tool and encountered this error: ModuleNotFoundError: No module named 'sklearn.neural_network.multilayer_perceptron'.

Next, I tried the suggested solution, changing "twobase = True" to "twobase = False" in extract_contexts.py (line 130 in my version):

    if type(model) != dict:
        model = {'general':model} #for compatibility with previously trained model
        twobase = False
    else:
        twobase = True
    base_model = base_models(base,twobase)

but the error still could not be resolved. Please help me get this issue resolved.

Thanks Nihar

al-mcintyre commented 1 year ago

Hi Nihar, which version of scikit-learn do you have installed? From the previous poster: "My solution for this was to downgrade sklearn to 0.22, so maybe that should be specified as a requirement or otherwise a higher version should be specified with an updated pickle." (in addition to the twobase = False change)

dasn588 commented 1 year ago

Hi, thanks for your response. We have installed scikit-learn versions 0.24 and 0.22. We also installed the updated pickle version 5, but we still get the same error. Command used:

mCaller.py -p positions.txt -r Reference.fasta -d mCaller/r95_twobase_model_NN_6_m6A.pkl -e sample_eventalign.tsv -f sample.fastq -b A

al-mcintyre commented 1 year ago

Hi, it looks like the correct version is still not accessible. Are you using a package manager? If you run python in a terminal and then enter

    import sklearn as skl
    print(skl.__version__)

what does it show?
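When several scikit-learn versions are installed, it also helps to see which interpreter is running and where the imported package lives. The following is a small sketch for that check; `describe_sklearn` is a hypothetical helper name, not something provided by mCaller or scikit-learn.

```python
import sys


def describe_sklearn():
    """Report the running interpreter and the scikit-learn it would import."""
    try:
        import sklearn
    except ImportError:
        # scikit-learn is not importable from this interpreter
        return {"python": sys.executable, "version": None, "location": None}
    return {
        "python": sys.executable,          # which Python is actually running
        "version": sklearn.__version__,    # version string of the imported package
        "location": sklearn.__file__,      # path the package was loaded from
    }


if __name__ == "__main__":
    for key, value in describe_sklearn().items():
        print(key, value, sep=": ")
```

Running this with the same interpreter that launches mCaller.py shows whether the downgraded version is the one actually being picked up.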

dasn588 commented 1 year ago

Hi, thanks for your suggestions. I was finally able to generate some output using the -m option. I now have two questions:

1) How do I generate the positions.txt file, which is one of the input files for 6mA methylation detection?
2) How should I interpret the output columns generated by mCaller with the -m option? For example:

J02459.1 b49cc8bb-d98c-4f34-bfac-e7396b1f3dd0 3019 AGACGMTCTGG 0.07,2.64,1.78,6.32,12.53,7.1,7.424780316344464 + m6A 0.94

dasn588 commented 1 year ago

Hi, can you please provide the output column descriptions and also let me know how to generate the positions.txt file for 6mA analysis?

Thanks Nihar

al-mcintyre commented 1 year ago

Hi Nihar, please see the README file.

Positions file: "file with a list of positions at which to classify bases (must be formatted as space- or tab-separated file with chromosome, position, strand, and label if training)" (this is optional; you can also try predicting based on motifs by using the -m option instead).

Output columns: "This returns a tabbed file with per-read predictions, where columns indicate chromosome, read name, genomic position, position k-mer context, features, strand, label, and probability of methylation predicted by mCaller for that position and read".
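The two formats quoted from the README can be sketched in a few lines of Python. Everything here is illustrative: `format_position_row`, `parse_prediction_line`, and the `Prediction` field names are hypothetical, the column order is taken from the README text above, and splitting on whitespace assumes that no field (such as a contig name) contains spaces.

```python
from collections import namedtuple

# Field names follow the column order quoted from the README.
Prediction = namedtuple(
    "Prediction",
    "chromosome read_name position context features strand label probability",
)


def format_position_row(chromosome, position, strand, label=None):
    """Build one tab-separated row for a positions file.

    The label column is only included when one is given (training).
    """
    fields = [chromosome, str(position), strand]
    if label is not None:
        fields.append(label)
    return "\t".join(fields)


def parse_prediction_line(line):
    """Split one per-read prediction line into named, typed fields."""
    parts = line.split()
    if len(parts) != 8:
        raise ValueError("expected 8 columns, got %d" % len(parts))
    chrom, read, pos, context, feats, strand, label, prob = parts
    return Prediction(
        chrom,
        read,
        int(pos),
        context,
        [float(x) for x in feats.split(",")],  # comma-separated feature values
        strand,
        label,
        float(prob),
    )
```

Applied to the example line posted earlier in this thread, `parse_prediction_line` yields the contig J02459.1, position 3019, the k-mer context AGACGMTCTGG, seven feature values, the + strand, the label m6A, and a methylation probability of 0.94.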

dasn588 commented 1 year ago

Hi, can you please answer my query about the availability of the next updated version of the mCaller tool?

Thanks Nihar

al-mcintyre commented 1 year ago

Hi Nihar, Unfortunately, there are no current plans for an updated version as everyone involved has moved on to other projects.

zyz-2000 commented 1 year ago

Hi, I tried using the mCaller tool and encountered an error:

ModuleNotFoundError: No module named 'sklearn.neural_network.multilayer_perceptron'

Then I downgraded sklearn to 0.22 and changed "twobase = True" to "twobase = False" in extract_contexts.py (line 130 in my version). But there are still problems:

Traceback (most recent call last):
  File "/public/home/yizhou/met/mcaller/mCaller-master/extract_contexts.py", line 199, in extract_features
    mod_prob = model[twobase_model].predict_proba([diffs]) #TODO: call model only when batch ready to write
KeyError: 'general'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/public/home/yizhou/miniconda3/envs/mcaller/bin/mCaller.py", line 187, in <module>
    main()
  File "/public/home/yizhou/miniconda3/envs/mcaller/bin/mCaller.py", line 183, in main
    distribute_threads(args.positions,args.motif,args.tsv,read2qual,args.reference,num_refs,base,mod,args.threads,args.num_variables,
  File "/public/home/yizhou/miniconda3/envs/mcaller/bin/mCaller.py", line 58, in distribute_threads
    extract_features(tsvname,refname,read2qual,nvariables,skip_thresh,qual_thresh,modelfile,classifier,0,endline=bytesize,train=train,pos_label=training_pos_dict,base=base,motif=motif,positions_list=positions_list)
  File "/public/home/yizhou/met/mcaller/mCaller-master/extract_contexts.py", line 222, in extract_features
    print(model[twobase_model].predict_proba([diffs]))
KeyError: 'general'

I don't know how to solve this error. I hope someone kind can help me.

Thank you