Closed nbedelman closed 3 years ago
Are you sure it's stuck and not still working? Can you check the output of top and free -g? The 66% progress indicates that the bam files have been successfully concatenated and sorted, and now ipyrad is running samtools index on the concatenated bam. It's possible it's still working and just taking a while. It's also possible, if the concatenated bam file is really huge, that you are running out of RAM and it's bogging down swapping to disk. How much RAM do you have on the box you're running on?
Thanks for the fast reply! It's on a cluster so I'm not sure how to run top, but I don't think space was the issue. I found the indexing line in clustmap_across.py and added the -c flag there:
    cmd3 = [
        ipyrad.bins.samtools,
        "index", "-c",  # edited to add -c NBE 2/1/2021
        os.path.join(
            self.data.dirs.across,
            "{}.cat.sorted.bam".format(self.data.name)
        ),
    ]
which worked: the step finished! It's now on to step 7, currently writing conversions, so hopefully I'm out of the woods. Do you think it would be worth making csi the default for indexing? From what I can find, bai can only handle chromosomes up to 512 Mbp (2^29 bases), but csi can deal with bigger ones. Or is there a better way to deal with this situation?
With the csi index, the step took 45 seconds.
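For anyone who wants to check up front whether a reference is likely to hit the bai limit, here is a minimal sketch. It assumes you have a .fai file for the reference (as written by samtools faidx, where the first two tab-separated columns are sequence name and length). The function names and the threshold constant are made up for this example and are not part of ipyrad.

```python
# Hypothetical helper, not part of ipyrad: flag reference sequences that are
# longer than the 2^29 bp (512 Mbp) coordinate range a .bai index can cover.
BAI_MAX_CONTIG_LENGTH = 2 ** 29  # 536,870,912 bp


def parse_fai_lengths(lines):
    """Return {sequence_name: length} from the lines of a .fai file."""
    lengths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lengths[fields[0]] = int(fields[1])
    return lengths


def contigs_too_long_for_bai(lengths, limit=BAI_MAX_CONTIG_LENGTH):
    """Return the sorted names of sequences whose length exceeds the limit."""
    return sorted(name for name, length in lengths.items() if length > limit)
```

Usage would look something like `contigs_too_long_for_bai(parse_fai_lengths(open("ref.fa.fai")))`, where ref.fa.fai is a placeholder path. If the returned list is non-empty, a csi index (samtools index -c) is the safer choice.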
What version of ipyrad are you working with?
I have 0.9.58
I added some code to protect against this. It will try the .bai indexing and if it fails it'll fall back and try the .csi version. It's checked into the repo. If you feel like checking it out and testing it that would be cool.
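A rough sketch of the fallback idea, for illustration only. This is not the code that was checked into the repo, and the function name and the injectable `run` argument are assumptions made for this example. It relies on samtools index accepting -b (bai) and -c (csi) and on a non-zero exit status when the bai index cannot be built.

```python
import subprocess


def index_bam_with_fallback(bam_path, samtools="samtools", run=subprocess.run):
    """Try a .bai index first, then fall back to .csi if that fails.

    Returns "bai" or "csi" depending on which index was built.
    Raises RuntimeError if both attempts fail.
    """
    last_error = ""
    for flag, kind in (("-b", "bai"), ("-c", "csi")):
        proc = run(
            [samtools, "index", flag, bam_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        if proc.returncode == 0:
            return kind
        # Keep the message from the failed attempt for the final error.
        last_error = proc.stderr.decode(errors="replace")
    raise RuntimeError("samtools index failed for {}: {}".format(bam_path, last_error))
```

The `run` parameter exists only so the function can be exercised without samtools installed; in normal use it would be called as `index_bam_with_fallback("sample.cat.sorted.bam")`, with a placeholder file name.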
awesome! I'll give it a try
Sorry for the slow reply, but it seems to have worked! I am running ipyrad with a new reference, which seems to have an error on step 5, but I'll open a new issue
+1
Hello, I am assembling clusters by aligning to a reference genome. I've done this before without issue, but this time I'm having some trouble.
This run did have an earlier problem, which is that the reference scaffolds were too long for a .bai index. That issue threw this error:
I was able to get around this by making a csi index instead of bai - I added the "-c" option to line 1935 of "mapping_reads" in clustmap.py
But maybe that somehow led to an inability to concatenate the bams? Can I just manually concatenate bams and move on to step 7?
Thanks, happy to supply more info!
Nate