PacificBiosciences / kineticsTools

Tools for detecting DNA modifications from single molecule, real-time sequencing data

Chemistry cannot be identified---cannot perform kinetic analysis #68

Closed bioxu closed 4 years ago

bioxu commented 5 years ago

Hi, everyone

I am trying to extract IPD values from a pbalign result. The alignment command is:

$ pbalign subreads.bam ref.fasta pbaligned2ref.bam --nproc 12

and the ipdSummary command is:

$ ipdSummary pbaligned2ref.bam --reference ref.fasta --identify m6A,m4C,m5C_TET --gff myVariants.gff --csv kinetics.csv --methylFraction

However, I get the following error:

Chemistry cannot be identified---cannot perform kinetic analysis
2019-06-18 06:37:52,811 [ERROR] Chemistry cannot be identified---cannot perform kinetic analysis
Chemistry cannot be identified---cannot perform kinetic analysis
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pbcommand/cli/core.py", line 136, in _pacbio_main_runner
    return_code = exe_main_func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/kineticsTools-0.6.1-py2.7-linux-x86_64.egg/kineticsTools/ipdSummary.py", line 699, in args_runner
    return kt.start()
  File "/usr/lib64/python2.7/site-packages/kineticsTools-0.6.1-py2.7-linux-x86_64.egg/kineticsTools/ipdSummary.py", line 412, in start
    return self.run()
  File "/usr/lib64/python2.7/site-packages/kineticsTools-0.6.1-py2.7-linux-x86_64.egg/kineticsTools/ipdSummary.py", line 475, in run
    ret = self._mainLoop()
  File "/usr/lib64/python2.7/site-packages/kineticsTools-0.6.1-py2.7-linux-x86_64.egg/kineticsTools/ipdSummary.py", line 629, in _mainLoop
    self.args.paramsPath)
  File "/usr/lib64/python2.7/site-packages/kineticsTools-0.6.1-py2.7-linux-x86_64.egg/kineticsTools/internal/basic.py", line 26, in getIpdModelFilename
    raise Exception(msg)
Exception: Chemistry cannot be identified---cannot perform kinetic analysis

What should I do? Thanks for your help.

bioxu commented 5 years ago

I also tried the commands suggested by others, but I got the same error.

blasr --bam --out baxxxxreads.bam --nproc 16 ./baxxxreads.bam ./ref.fasta
samtools sort -o baxxreads.sorted.bam baxxxxreads.bam
samtools index baxxreads.sorted.bam
pbindex baxxreads.sorted.bam
ipdSummary baxxreads.sorted.bam --reference ref.fasta --identify m6A,m4C,m5C_TET --numWorkers 16 --gff basemod.gff

rhallPB commented 5 years ago

How old is the data? The current recommendation for recent data would be to use pbmm2 for the alignment https://github.com/PacificBiosciences/pbmm2. You would also likely have to install the SMRT Analysis package to have the recent chemistry models.
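As a rough illustration of that suggestion, the sketch below assembles a pbmm2 alignment command followed by the ipdSummary call from the original post. The helper name `build_pipeline` and all file names are made up for this example, indexing steps are omitted, and the pbmm2 options shown (`--sort`, `-j`) should be checked against the pbmm2 documentation for your installed version.

```python
import shlex


def build_pipeline(subreads, reference, aligned, gff, threads=12):
    """Return argv lists for an align step followed by ipdSummary.

    Hypothetical helper: it only builds the command lines, it does not run them.
    """
    # pbmm2 align takes the reference first, then the input BAM, then the output BAM.
    align = ["pbmm2", "align", reference, subreads, aligned,
             "--sort", "-j", str(threads)]
    # ipdSummary flags are copied from the command in the original post.
    kinetics = ["ipdSummary", aligned, "--reference", reference,
                "--identify", "m6A,m4C,m5C_TET", "--gff", gff]
    return [align, kinetics]


# Print the commands so they can be reviewed before running them by hand.
for argv in build_pipeline("subreads.bam", "ref.fasta", "aligned.bam", "basemod.gff"):
    print(shlex.join(argv))
```

Printing the commands first, rather than executing them directly, makes it easy to confirm the arguments before committing to a long alignment run.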

AntonS-bio commented 4 years ago

I had a similar problem, except that in my case the exception was raised in loader.py. The message was the same: "Exception: Chemistry cannot be identified---cannot perform kinetic analysis"

I think this happens because the input metadata or BAM does not specify the chemistry used. I bypassed it by editing the file that raises the exception, in my case loader.py, which had

if majorityChem == 'unknown':
    msg = "Chemistry cannot be identified---cannot perform kinetic analysis"
    LOG.error(msg)
    raise Exception(msg)

but since I knew the chemistry used, I changed it to

if majorityChem == 'unknown':
    majorityChem = "P6-C4"
    #msg = "Chemistry cannot be identified---cannot perform kinetic analysis"
    #LOG.error(msg)
    #raise Exception(msg)

Everything worked after that.
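A slightly more cautious variant of the same workaround is sketched below. It keeps the original exception as the default and only substitutes a chemistry when one is passed in explicitly. The function `resolve_chemistry` and its `override` parameter are hypothetical and are not part of kineticsTools; this only illustrates the idea.

```python
import logging

LOG = logging.getLogger(__name__)


def resolve_chemistry(majority_chem, override=None):
    """Return the chemistry to use; an unknown chemistry needs an explicit override.

    Hypothetical helper, not the kineticsTools API.
    """
    if majority_chem != "unknown":
        return majority_chem
    if override is None:
        # Same behaviour as the unpatched code: refuse to guess.
        msg = "Chemistry cannot be identified---cannot perform kinetic analysis"
        LOG.error(msg)
        raise Exception(msg)
    # Leave a trace in the log that the chemistry was forced by the user.
    LOG.warning("Chemistry unknown; using user-supplied override %r", override)
    return override
```

With this shape, `resolve_chemistry("unknown", override="P6-C4")` returns "P6-C4" and logs a warning, while calling it without an override still fails loudly.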

camilogarciabotero commented 4 years ago

Hi @MyProgramWorks

Your solution worked for me as well.

Thanks!

natechols commented 4 years ago

The error message doesn't necessarily mean the chemistry info is missing; it is more likely that you are using data that hasn't been tested with this version of kineticsTools. Forcing P6-C4 chemistry will indeed let it run, but we haven't tested chemistries beyond the ones supported here, and we can't promise that the results will be accurate.
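One way to tell the two cases apart is to look at what the BAM header actually records. The sketch below assumes the PacBio BAM convention of storing kit and basecaller information as semicolon-separated key=value pairs in the DS field of each @RG line (for example BINDINGKIT, SEQUENCINGKIT, BASECALLERVERSION). The function name and the sample header values are made up for illustration.

```python
def read_group_chemistry(header_text):
    """Return a list of (read group ID, DS key/value dict), one per @RG line."""
    results = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs; split on the first colon only.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        ds = {}
        for item in fields.get("DS", "").split(";"):
            if "=" in item:
                key, value = item.split("=", 1)
                ds[key] = value
        results.append((fields.get("ID"), ds))
    return results


# Made-up header line for illustration; real kit part numbers will differ.
sample = (
    "@HD\tVN:1.5\tSO:coordinate\n"
    "@RG\tID:abc123\tPL:PACBIO\t"
    "DS:READTYPE=SUBREAD;BINDINGKIT=000-000-000;SEQUENCINGKIT=000-000-000;"
    "BASECALLERVERSION=0.0;FRAMERATEHZ=100\n"
)
for rg_id, ds in read_group_chemistry(sample):
    print(rg_id, ds.get("BINDINGKIT"), ds.get("SEQUENCINGKIT"), ds.get("BASECALLERVERSION"))
```

The header text can be obtained with `samtools view -H your.bam`. If the kit fields are present but the tool still reports an unknown chemistry, the data is probably from a chemistry the installed version has no model for, rather than missing metadata.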

camilogarciabotero commented 4 years ago

Thanks, Nate, for the clarification.

I do have PacBio RSII chemistry data, so where can I find the exact chemistries the program supports?

natechols commented 4 years ago

If you have RSII data, you are much better off downloading SMRT Analysis 2.3 instead. We still make it available here (scroll to the bottom and click "Previous Releases of SMRT Analysis for PacBio RS II"): https://www.pacb.com/support/software-downloads/

I don't recommend trying to use the latest versions of our tools unless you have relatively recent data.