Closed lauragails closed 1 year ago
Do you see any occurrence of Error
in the top level snakemake log? I think it's likely that the alignment jobs haven't started yet.
But I do see smrtcell_stats output (read length and quality summaries), which I didn't expect yet. The reason why I'm curious is in config.yaml it says that "stats" and "coverage_qc" require alignment. The output for this, and for jellyfish, seems to be working fine, so I know the bams are being read in OK.
It sounds like it just ran the smrtcell_stats and jellyfish jobs before alignment. Neither of these require alignment. If the workflow is still running, it's likely that alignments haven't been created yet.
I'll fix the comment for smrtcell_targets->stats
(#133).
A possibly related question is: Is it necessary to feed .bam and .bam.bai into the pipeline? I don't have the index files, but I can index them + resubmit. I now see how the aligned bam.bai might not be added to the targets loop in python.
Unaligned BAMs cannot be indexed as bai, and we don't require indices for any downstream steps. This error message from pysam/htslib should just be a warning instead, as it doesn't prevent the script from completing.
Perfect. No I don't see any evidence of an "Error" - willing to bet the pipeline just didn't get to alignments.
Makes sense RE unaligned bams. Thank you as always for the unbelievably fast responses!
Side note for posterity's sake: PacBio does generate an index for unaligned BAMs in the .pbi
format, but it isn't necessary for this workflow. Since we plan on acting on all reads and we don't care about the order in which we access them, indices are unnecessary.
The .pbi
index can be useful for some other PacBio tools, or for trying to access reads quickly from arbitrary ZMWs.
Hello, Happy to report that I'm chugging along! I am running
process_smrtcells
via process_smrtcells.local.shIt has been running overnight, and it's possible that all is well! But I am concerned that things will go awry because aligned directories haven't yet been created ie:
samples/{sample}/aligned/
.edit I only see the following directories in any {sample}/aligned:
benchmarks jellyfish logs smrtcell_stats
And I do see
smrtcell_stats
output (read length and quality summaries), which I didn't expect yet. The reason why I'm curious is inconfig.yaml
it says that "stats" and "coverage_qc" require alignment. The output for this, and for jellyfish, seems to be working fine, so I know the bams are being read in OK.A possibly related question is: Is it necessary to feed .bam
and
.bam.bai into the pipeline? I don't have the index files, but I can index them + resubmit. I now see how the aligned bam.bai might not be added to the targets loop in python.Here is an example of a logfile that is showing a warning about the missing index.
Thank you!