nanoporetech / pinfish

Tools to annotate genomes using long read transcriptomics data

polish_clusters step #3

Closed callumparr closed 5 years ago

callumparr commented 5 years ago

I am a little confused about what the BAM input should be. Is it the original BAM file used in the initial step for the GFF conversion?

So far I have run the following steps:

Converted a BAM of splice-aware minimap2 alignments against an indexed hg38 reference to a GFF file.

Clustered the transcripts into clustered_transcripts.gtf, generating clusters.tsv.

Now I am stuck here:

polish_clusters -a clusters.tsv -c 50 -o consensus_transcripts.fas -t 40 sorted.bam

Does this example command omit a step that maps the consensus sequences back to the reference and generates another sorted BAM file?

Or is sorted.bam an example output file, in addition to the fas file?

Sorry for my naivety.

I ran it and got the following error related to minimap2. I recently updated my minimap2 install to a new release, and perhaps this prevents the executable from being found.

tkx292:~ callum$ ./pinfish/polish_clusters/polish_clusters -a ~/Sync_later/clusters.tsv -c 50 -o ~/Sync_later/20180903_HDF_consensus_transcripts.fas ~/Sync_later/20180903_HDF_consensus.sorted.bam
polish_clusters: 17:43:28 Failed running command: minimap2 -h - exit status 127

bsipos commented 5 years ago

Hi,

Yes, the BAM input for polish_clusters is the original file which got converted into GFF. Then you would indeed map back the polished output to the genome and again convert the resulting BAM into GFF. The pinfish analysis pipeline does all these things for you.

Regarding the minimap2 error: I think you are right and the update must have caused this issue. Please make sure that you have minimap2 in the path.

Best, Botond
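
For anyone hitting the same error: exit status 127 is the conventional shell code for "command not found", which fits the PATH explanation above. A minimal sketch of a check to run before polish_clusters follows. The helper name check_tool and the directory $HOME/minimap2 are hypothetical and only for illustration; substitute wherever minimap2 is actually installed.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of pinfish or minimap2.
# It reports whether a tool can be resolved through PATH.
check_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        echo "$1 found at: $tool_path"
        return 0
    else
        echo "$1 not found in PATH" >&2
        return 127
    fi
}

# $HOME/minimap2 below is an assumed install location, shown only as an example.
check_tool minimap2 || echo "Try: export PATH=\"\$HOME/minimap2:\$PATH\""
```

If the check fails, add the directory containing the minimap2 executable to PATH in the same shell session and rerun the polish_clusters command.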

callumparr commented 5 years ago

It turned out that minimap2 was the issue since the update. Removing it, installing the new release, and adjusting my PATH got it to work.

I already had reads mapped with minimap2, so I initially wanted to try pinfish on its own, but the pipeline would indeed be more efficient.

Thank you for your help again.