novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

FYI for those trying to run (documentation additions) #128

Closed ArianeMora closed 1 year ago

ArianeMora commented 1 year ago

2 questions (a, b) and 2 suggestions (1, 2) :)

a. The model tr2pr1.sum_err.MODEL.rbf.model.dump (trained with a combination of mis, ins, and del) is missing from the models folder: https://github.com/novoalab/EpiNano/tree/master/models. Can this be added back?
b. What is the q3 column? The columns in my output file from EpiNano variants are: #Ref,pos,base,strand,cov,q_mean,q_median,q_std,mis,ins,del

Just a couple of issues I ran into and how to solve them:

  1. If you get the .dict error, don't try using picard as the warning suggests; you can simply run samtools dict -o ref.dict ref.fasta, which will create the dict for you (http://www.htslib.org/doc/samtools-dict.html).
  2. You must use Python 3.6 (3.8 will throw errors).
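
For anyone scripting step 1, here is a minimal Python sketch of the same samtools call. The helper names (samtools_dict_command, make_dict) are made up for illustration, and placing the .dict next to the FASTA with the extension swapped (ref.fasta -> ref.dict) is an assumption about what the downstream tool expects, so check it against the error message you get.

```python
import shutil
import subprocess
from pathlib import Path


def samtools_dict_command(fasta):
    """Build the `samtools dict` command for a reference FASTA.

    Assumption: the .dict file should sit next to the FASTA with the
    extension replaced (ref.fasta -> ref.dict).
    """
    fasta = Path(fasta)
    out = fasta.with_suffix(".dict")
    # -o writes the dictionary to a file instead of standard output
    return ["samtools", "dict", "-o", str(out), str(fasta)]


def make_dict(fasta):
    """Run samtools dict, failing early if samtools is not on PATH."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    cmd = samtools_dict_command(fasta)
    subprocess.run(cmd, check=True)
    return cmd[3]  # path of the .dict file that was written
```

Calling make_dict("ref.fasta") should leave a ref.dict file beside the reference, provided samtools is installed.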

It might be helpful for others if this is added to the README :) Thanks for the otherwise well-documented start-up guide! Note: I was running on a MacBook Pro.

enovoa commented 1 year ago

Hi @ArianeMora, thanks for your feedback.

a. Where did you see this model that is "missing" (i.e. in which version did you see it)?
b. q3 is the quality at position 3. EpiNano generates outputs per 5-mer and per position.
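
To make answer (b) concrete, here is a minimal Python sketch of how per-position qualities could be laid out per 5-mer, so that q3 is the quality of the third base in each window. The function name kmer_windows, the toy bases and quality values, and the row layout are all assumptions for illustration; this is not EpiNano's own code and its real output columns may differ.

```python
def kmer_windows(bases, quals, k=5):
    """Slide a k-base window over per-position records.

    bases: reference bases, one per position
    quals: mean base quality per position (same length as bases)
    Returns one dict per window with keys kmer, q1..qk.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    rows = []
    for start in range(len(bases) - k + 1):
        row = {"kmer": "".join(bases[start:start + k])}
        for i in range(k):
            # q1 is the first base of the window, q3 the third (the centre when k=5)
            row["q{}".format(i + 1)] = quals[start + i]
        rows.append(row)
    return rows


# Toy input: made-up bases and qualities, not real EpiNano output.
bases = list("GGACTAC")
quals = [12.0, 9.5, 7.1, 10.2, 11.8, 8.4, 13.0]
for row in kmer_windows(bases, quals):
    print(row["kmer"], "q3 =", row["q3"])
```

Each printed row pairs a 5-mer with the quality of its third base, which is how a q3 column can show up in per-5-mer output while being absent from per-position output.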

About your suggestions,

  1. Thanks, we will try to add this suggestion to the README.
  2. Version 3.6 is listed in the list of software versions used. We did not test all possible versions (or future ones appearing later on); however, there is a Dockerfile that overcomes all possible issues with versions. Did you try it out?

Thanks,
Eva

ArianeMora commented 1 year ago

Hi Eva, thanks for getting back to me so quickly!

a. It's not on the master branch: https://github.com/novoalab/EpiNano/tree/master/models
b. Thank you!

  1. Yes, it definitely works well with Python 3.6! I was just suggesting adding to your requirements that Python 3.6 is preferred (I think there is a conflict with pandas in later versions). This is the env I set up, and it worked for me (might be helpful for others wanting to run EpiNano in a conda env):
     conda create --name epinano python=3.6 scikit-learn==0.20.2 cloudpickle==1.6.0 fsspec==0.3.3 toolz==0.11.1 pandas==0.25.1 dask==2.5.2

Thanks again!

Huanle commented 1 year ago

Hi @ArianeMora ,

a. tr2pr1.sum_err3.MODEL.rbf.model.dump is what you are looking for. The number 3 here just indicates where and how sum_err is combined.
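
For readers unsure what a sum_err feature looks like, here is a minimal Python sketch that adds one to a per-site table with the header reported earlier in this thread. It assumes sum_err is the plain sum of the mis, ins, and del columns; that is an assumption for illustration, and the sketch does not try to reproduce whatever the "3" suffix encodes. The function name add_sum_err and the input values are made up.

```python
import csv
import io

ERR_COLUMNS = ("mis", "ins", "del")


def add_sum_err(handle):
    """Read a per-site CSV and add sum_err = mis + ins + del to each row.

    Assumption: sum_err is the plain sum of the three error columns.
    """
    rows = []
    for row in csv.DictReader(handle):
        row["sum_err"] = sum(float(row[c]) for c in ERR_COLUMNS)
        rows.append(row)
    return rows


# Toy input with made-up values, using the header reported earlier in this thread.
example = io.StringIO(
    "#Ref,pos,base,strand,cov,q_mean,q_median,q_std,mis,ins,del\n"
    "chr1,100,A,+,50,10.5,11.0,2.1,0.25,0.125,0.5\n"
)
for row in add_sum_err(example):
    print(row["#Ref"], row["pos"], row["sum_err"])
```

To use it on a real file, pass an open file handle instead of the StringIO object.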