nanoporetech / pinfish

Tools to annotate genomes using long read transcriptomics data

polish result seems quite strange #19

Closed ljw90607 closed 4 years ago

ljw90607 commented 4 years ago

Dear @bsipos

I ran polish_cluster and collapse_partials on the clustered reads and got some unexpected results. As I understand it, polish_cluster is meant to correct only the clustered reads, yet the polish_cluster output contains more reads than the cluster_gff output. The only change I made for this run was the input BAM file: it is the same BAM file used for GFF generation, but with only the primary reads extracted (the polish step did not work with the whole sorted BAM file).
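For reference, "extracting only the primary reads" usually means dropping records whose SAM FLAG has the secondary (0x100) or supplementary (0x800) bit set. The sketch below shows that filter on SAM-formatted text lines. It is only an illustration: the function names `is_primary` and `filter_primary` are made up here, and this is not necessarily how the BAM file in this run was produced.

```python
# Minimal sketch: keep header lines and primary alignments from SAM text.
# FLAG bit values come from the SAM specification.
SECONDARY = 0x100      # secondary alignment
SUPPLEMENTARY = 0x800  # supplementary (chimeric) alignment


def is_primary(flag: int) -> bool:
    """Return True if neither the secondary nor the supplementary bit is set."""
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def filter_primary(sam_lines):
    """Yield header lines unchanged and only the primary alignment records."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines carry no FLAG field, so pass them through.
            yield line
            continue
        fields = line.split("\t")
        # Column 2 of a SAM record is the integer FLAG.
        if is_primary(int(fields[1])):
            yield line
```

With samtools, the same selection is commonly done with `samtools view -F 0x900`, which excludes any record with either bit set.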

I ran the collapse step as well, but its output did not seem to have been polished. I would really appreciate any comment on this issue.
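To make the count comparison concrete, here is a rough sketch of how the two outputs could be counted. It assumes the clustered output is a tab-separated GFF with one `mRNA` line per transcript and the polished output is a FASTA file. Both are assumptions about the file layout, so adjust `feature_type` if the files differ. The function names are hypothetical.

```python
# Minimal sketch: count transcripts in a GFF and sequences in a FASTA.
# Assumes one line with feature type "mRNA" (column 3) per transcript.


def count_gff_features(lines, feature_type="mRNA"):
    """Count GFF lines whose third column equals feature_type."""
    n = 0
    for line in lines:
        # Skip blank lines and comment/pragma lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[2] == feature_type:
            n += 1
    return n


def count_fasta_records(lines):
    """Count FASTA records by counting header lines starting with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Example usage with hypothetical file names:
# with open("clustered.gff") as gff, open("polished.fasta") as fa:
#     print(count_gff_features(gff), count_fasta_records(fa))
```

If the FASTA count is larger than the GFF count, the two files probably do not come from the same run.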

Jungwoo

bsipos commented 4 years ago

Did you use the snakemake pipeline or run the tools manually? Are you sure you did not mix up the output files?

ljw90607 commented 4 years ago

Dear @bsipos

I ran the tools manually. Could this be caused by a mismatch between the BAM file and the clustered GFF file?

bsipos commented 4 years ago

Well, hard to say - but I recommend using the pipeline and avoiding manual runs!

ljw90607 commented 4 years ago

Thank you. I will try the run again and share the result.

Jungwoo