PacificBiosciences / kineticsTools

Tools for detecting DNA modifications from single molecule, real-time sequencing data

KeyError: 'IPD' #15

Open eltonjrv opened 8 years ago

eltonjrv commented 8 years ago

Hello folks, I believe I am almost there in getting ipdSummary to run smoothly. Could you please check the attached log file and help me solve the error?

Thanks a lot in advance,
Best, Elton

ipdSummary.err.txt

rhallPB commented 8 years ago

It looks like the IPD values have not been propagated from the bax.h5 files to the cmp.h5 file. Can you paste a copy of your pbalign command line?

It's likely you can just run the following to correct the cmp.h5:

loadPulses <bax fofn> <cmpH5 file> -metrics DeletionQV,IPD,InsertionQV,PulseWidth,QualityValue,MergeQV,SubstitutionQV,DeletionTag -byread

eltonjrv commented 8 years ago

Thanks for quickly replying, rhallPB! Here go my commands:

$ pbalign --nproc 20 --forQuiver --byread --tmpDir tmp_pbalign input.fofn genome.fa baxh5-vs-genome.cmp.h5
$ ipdSummary.py --reference genome.fa --outfile baxh5-vs-genome.ipdSummary --numWorkers 20 -v baxh5-vs-genome.cmp.h5 >ipdSummary.log 2>ipdSummary.err &

Elton

rhallPB commented 8 years ago

Sorry, I was editing my comment. The --forQuiver option does not include the IPD. You can add the IPD to the pbalign command line, or, simplest of all, post-process with the loadPulses command in my comment above.

eltonjrv commented 8 years ago

Thanks again, rhallPB. I am running smrtanalysis 2.3.0 with pbalign 0.2.0.138342. Unfortunately, I cannot see any "IPD" option in the pbalign --help output. If I drop the --forQuiver option from my command line, will the IPD be added by default? In the meantime, I'll try the loadPulses command you posted above.

Thanks a lot

rhallPB commented 8 years ago

The pbalign option is --metrics. By default, the --forQuiver option sets DeletionQV,DeletionTag,InsertionQV,MergeQV,SubstitutionQV and loads the chemistry information. There are a number of ways to get to the same result. I think the safest is to use --forQuiver in pbalign, then add the IPD using loadPulses. Otherwise you will also have to run loadChemistry.py.
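As a rough illustration of the "align with --forQuiver, then add what is missing" route, here is a small Python sketch that works out which metrics still need loading and assembles a loadPulses command. It uses only the flags quoted in this thread (-metrics, -byread). The helper names missing_metrics and load_pulses_command are made up for this example, and loading IPD on its own (rather than the full metric list from the earlier comment) is an assumption that has not been verified here.

```python
import shlex

# Metrics that --forQuiver sets, as listed in this thread.
FOR_QUIVER_METRICS = [
    "DeletionQV", "DeletionTag", "InsertionQV", "MergeQV", "SubstitutionQV",
]


def missing_metrics(loaded, required):
    """Return the required metrics not present in `loaded`, keeping order."""
    loaded_set = set(loaded)
    return [m for m in required if m not in loaded_set]


def load_pulses_command(bax_fofn, cmp_h5, metrics):
    """Build a loadPulses argument list (hypothetical helper).

    Only the flags quoted in this thread are used; nothing is executed.
    """
    return ["loadPulses", bax_fofn, cmp_h5,
            "-metrics", ",".join(metrics), "-byread"]


# ipdSummary needs IPD on top of what --forQuiver already loaded.
to_add = missing_metrics(FOR_QUIVER_METRICS, FOR_QUIVER_METRICS + ["IPD"])

# File names are the ones from the commands posted above.
cmd = load_pulses_command("input.fofn", "baxh5-vs-genome.cmp.h5", to_add)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run(cmd, check=True). If loading a single metric does not work in your install, passing the full metric list from the earlier loadPulses command is the fallback.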

eltonjrv commented 8 years ago

Got it! I'm running loadPulses right now to add the IPD to the metrics, as you suggested. I'm optimistic that ipdSummary will be working soon. Thanks very much, rhallPB. If any other problem comes up, I'll shout! Best, Elton