novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

EpiNano output issue #148

Open TTT16 opened 3 months ago

TTT16 commented 3 months ago

We are using the scripts published with your paper to repeat the m6A calling with EpiNano, using the datasets published in Nature Communications (2019), including the Curlcakes and yeast datasets.

EpiNano: Detection of m6A RNA Modifications Using Oxford Nanopore Direct RNA Sequencing

· First Online: 04 June 2021

We are unable to get 2 output files: sample.per.site.var.csv and sample.per.site.5mer.csv

I think Epinano_Variants.py was running, since it reported the analysis time; however, we got no output files.

For the yeast dataset:

    Epinano_Variants.py \
    -c 36 \
    -r $ref \
    -b $folder_name.sorted.bam

    analysis took 52.780353307724 seconds

For the Curlcakes dataset:

    Epinano_Variants.py -c 36 -r $ref -b $folder_name.sorted.bam

    analysis took 5.046196460723877 seconds

Could you please help with this issue? Thank you, Trinh
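One way to tell "nothing was written" apart from "output was written under a different name or in a different directory" is to search for per-site CSV files after the run. The following is a minimal sketch, not part of EpiNano: `find_per_site_csvs` is a hypothetical helper, and both the `*per.site*.csv` pattern and the idea that outputs land somewhere under the searched directory are assumptions based only on the file names mentioned above.

```python
from pathlib import Path


def find_per_site_csvs(search_dir, pattern="*per.site*.csv"):
    """Return CSV paths under search_dir whose names match pattern, newest first.

    The default pattern is an assumption taken from the file names quoted in
    this thread (e.g. sample.per.site.var.csv); adjust it as needed.
    """
    root = Path(search_dir)
    hits = [p for p in root.rglob(pattern) if p.is_file()]
    # Sort by modification time so the most recent run shows up first.
    return sorted(hits, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # Search the current working directory and everything below it.
    for path in find_per_site_csvs("."):
        print(path)
```

Running it from the directory that holds the BAM file, and again from the directory the command was launched in, covers the two most likely places under these assumptions.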

enovoa commented 3 months ago

Hi, can you please try with the demo data? https://github.com/novoalab/EpiNano/tree/master/test_data/make_predictions Thank you.

SabeenRaza75 commented 3 months ago

I've tried to use the demo data from https://github.com/novoalab/EpiNano/tree/master/test_data/make_predictions and the commands from run.sh, which requires Epinano_Variants.py. I've downloaded and installed version 1.2.4, but it still gives me an error.

Commands:

    python /apps/epinano/1.2.4/Epinano_Variants.py \
    -R /EpiNano_input-data/ref/ref.fa \
    -b /EpiNano_input-data/wt_data/wt.bam \
    -n 6 \
    -T t \
    -s java -jar /sam2tsv/2.0/share/jvarkit-sam2tsv-1.0-0/sam2tsv.jar

It gave an error:

    usage: Epinano_Variants.py [-h] -b BAM -r REFERENCE [-c CPUS]
    Epinano_Variants.py: error: the following arguments are required: -r/--reference

I corrected the -R to -r and re-ran the commands as mentioned in run.sh:

    -r /EpiNano_input-data/ref/ref.fa \
    -b /EpiNano_input-data/wt_data/wt.bam \
    -n 6 \
    -T t \
    -s java -jar /sam2tsv/2.0/share/jvarkit-sam2tsv-1.0-0/sam2tsv.jar

It says unrecognized arguments in Epinano_Variants.py:

    usage: Epinano_Variants.py [-h] -b BAM -r REFERENCE [-c CPUS]
    Epinano_Variants.py: error: unrecognized arguments: -n 6 -T t -s java -jar /apps/sam2tsv/2.0/share/jvarkit-sam2tsv-1.0-0/sam2tsv.jar

enovoa commented 3 months ago

Please try removing the argument "-s java -jar /apps/sam2tsv/2.0/share/jvarkit-sam2tsv-1.0-0/sam2tsv.jar". I believe that this is a mistake from a previous version.

SabeenRaza75 commented 3 months ago

Yes, but version 1.2.4 doesn't allow other flags such as -n and -T either; those are from the previous version too. The only flags version 1.2.4 allows are -c, -r and -b; it doesn't accept anything else. But when I run Epinano_Variants.py with just these three:

    Epinano_Variants.py \
    -c 36 \
    -r $ref \
    -b $folder_name.sorted.bam

    analysis took 52.780353307724 seconds

It doesn't give me anything. Also, do I use the sample.sorted.bam file or just sample.bam?

enovoa commented 3 months ago

Hi, have you tried with the DEMO data with the right command line parameters as described in the README?

enovoa commented 3 months ago

You seem to be using parameters required for older versions of EpiNano.

SabeenRaza75 commented 3 months ago

> Hi, have you tried with the DEMO data with the right command line parameters as described in the README?

I'll try that today.

SabeenRaza75 commented 3 months ago

> Hi, have you tried with the DEMO data with the right command line parameters as described in the README?

I'm trying to run Epinano_Predict.py using the demo data as follows:

    python $EPINANO_HOME/Epinano_Predict.py \
    --model $EPINANO_HOME/models/rrach.q3.mis3.del3.linear.dump \
    --predict ko.per.site.csv \
    --columns 7,9,11 \
    --out_prefix ko_mod_prediction


This initially gave me an error saying sklearn is not installed, so I installed it (version 0.24.2) in the $EPINANO_HOME//epinano1.2_venv/lib/python3.6/site-packages/ directory (I'm running Python version 3.6.8). Now when I run the above command it says:

    Traceback (most recent call last):
      File "/epinano/1.2.4/Epinano_Predict.py", line 139, in <module>
        loaded_model = pickle.load (open (m,'rb'))
    ModuleNotFoundError: No module named 'sklearn.svm.classes'


It seems that the model was pickled using a different version of scikit-learn?

Please advise. Also, am I using the correct trained model?

enovoa commented 3 months ago

Hi @SabeenRaza75, sorry, it seems that you have problems with the installed versions and/or the locations in which you have installed the dependencies, but I am unable to figure out why from the information given above. As a solution, I would suggest switching to MasterOfPores, a Nextflow workflow that has EpiNano (and many other tools!) embedded in it. It does not require you to install any of the software or its dependencies, and it is also a great way to ensure reproducibility, traceability and monitoring of the computational resources used. Let me know if this solution works for you. Thanks!

Huanle commented 3 months ago

Sorry for chiming in late. @TTT16, you might need scikit-learn 0.20.2, as indicated here.
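The ModuleNotFoundError above is consistent with a version mismatch: a pickle stores the import path of each class it contains, and `sklearn.svm.classes` is a module path that recent scikit-learn releases no longer provide. A minimal sketch of a pre-flight check follows; `parse_version` and `same_release` are hypothetical helpers written for this illustration, and the pinned value is simply the 0.20.2 mentioned in this thread rather than something read from the model file.

```python
import re

# Version suggested in this thread for loading the bundled models (assumption:
# the models were pickled with exactly this release).
REQUIRED = "0.20.2"


def parse_version(text):
    """Turn '0.24.2' into (0, 24, 2); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def same_release(installed, required=REQUIRED):
    """True when both version strings agree on every numeric component."""
    return parse_version(installed) == parse_version(required)


if __name__ == "__main__":
    try:
        import sklearn
    except ImportError:
        print("scikit-learn is not installed in this environment")
    else:
        if same_release(sklearn.__version__):
            print("scikit-learn", sklearn.__version__, "matches", REQUIRED)
        else:
            print("scikit-learn", sklearn.__version__, "differs from", REQUIRED,
                  "- unpickling the model may fail")
```

Running the check with the same interpreter that runs Epinano_Predict.py (for example the one inside the virtual environment) also confirms which site-packages directory is actually being used.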

Huanle commented 3 months ago

> Yes, but version 1.2.4 doesn't allow other flags such as -n and -T either; those are from the previous version too. The only flags version 1.2.4 allows are -c, -r and -b; it doesn't accept anything else. But when I run Epinano_Variants.py with just these three: Epinano_Variants.py -c 36 -r $ref -b $folder_name.sorted.bam
>
> It doesn't give me anything. Also, do I use the sample.sorted.bam file or just sample.bam? analysis took 52.780353307724 seconds

The latest Epinano_Variants.py does not rely on the 3rd party sam2tsv.jar program required by its predecessors. Please check its help message.
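The two error messages earlier in the thread follow directly from the usage line `[-h] -b BAM -r REFERENCE [-c CPUS]`. As an illustration only, here is a stand-in argparse parser with the same three options; it is not the real Epinano_Variants.py source, and the long names `--bam` and `--cpus` and the default of 1 CPU are assumptions (only `--reference` appears in the quoted error).

```python
import argparse


def build_parser():
    """Stand-in parser mirroring the usage line quoted in this thread."""
    parser = argparse.ArgumentParser(prog="Epinano_Variants.py")
    parser.add_argument("-b", "--bam", required=True)        # long name assumed
    parser.add_argument("-r", "--reference", required=True)
    parser.add_argument("-c", "--cpus", type=int, default=1)  # long name and default assumed
    return parser


def split_args(argv):
    """Return (namespace of recognized options, list of leftover tokens)."""
    return build_parser().parse_known_args(argv)


if __name__ == "__main__":
    # Old-style flags (-n, -T, -s) are not defined, so they end up as leftovers.
    known, leftover = split_args(
        ["-r", "ref.fa", "-b", "wt.bam", "-n", "6", "-T", "t"]
    )
    print("recognized:", vars(known))
    print("unrecognized:", leftover)
```

With `-R` in place of `-r`, the same parser exits with "the following arguments are required: -r/--reference", because option names are case-sensitive.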