galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

BAM files for large genomes and their .csi index #10909

Open FredericBGA opened 3 years ago

FredericBGA commented 3 years ago

We may have lost the .csi index for BAM files with Galaxy 20.05. I don't understand why; it may be a small issue, because all the code seems to be there (galaxyproject/galaxy#9570).

When I create a BAM (with BWA for example, on the wheat genome, so the BAM needs a .csi index), I no longer get the link to download the .csi index (I did previously). We use the "display with IGV" feature a lot, and it looks like the .csi index is missing (IGV is complaining...).
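For context on why such a BAM needs a .csi at all: a .bai index can only address coordinates up to 2^29 bp (512 Mbp) per contig, so any reference with a longer contig has to be indexed as .csi. Here is a minimal sketch of that check. The function name and the contig lengths are hypothetical and only for illustration; they are not taken from Galaxy or from the actual wheat assembly.

```python
# Largest contig length a .bai index can address: 2**29 bp (512 Mbp).
BAI_MAX_CONTIG_LENGTH = 2**29


def contigs_needing_csi(contig_lengths):
    """Return the names of contigs too long for a .bai index, sorted by name.

    contig_lengths maps contig name -> length in bp (for example, parsed
    from the @SQ lines of a BAM header).
    """
    return sorted(
        name
        for name, length in contig_lengths.items()
        if length > BAI_MAX_CONTIG_LENGTH
    )


# Hypothetical lengths, for illustration only.
example = {"chrA": 750_000_000, "chrB": 400_000_000}
print(contigs_needing_csi(example))
```

If the returned list is non-empty, the BAM has to be indexed with a .csi, and a tool expecting only a .bai will not be able to use it.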

two remarks:

I'm really keen to help, but I may need some help getting started, because right now I don't really understand what is going on.

mvdbeek commented 3 years ago

Can you upload a problematic BAM file with just a few reads to usegalaxy.org and then publish that history?

FredericBGA commented 3 years ago

I've created this history: BAM

The .bai is the index that I've downloaded from my Galaxy history. The .csi is the index that I've created with samtools 1.9 or 1.11

I observe something I can't understand: the .csi was created on my server, and when I upload it to Galaxy and then run a diff between the two files (the .csi and the .dat that should hold the .csi), it says: Binary files X and Y differ
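Since diff only reports that the bytes differ, one thing worth checking is what kind of index each file actually is. As far as I know, a .csi written by samtools is BGZF-compressed and starts with the magic bytes `CSI\1` once decompressed, while a .bai is uncompressed and starts with `BAI\1`. A minimal sketch, assuming those two layouts; the function name is hypothetical and this is not Galaxy code.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def index_kind(path):
    """Return 'CSI', 'BAI', or 'unknown' based on the file's magic bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head[:2] == GZIP_MAGIC:
        # BGZF is a series of gzip members, so the gzip module can read it.
        with gzip.open(path, "rb") as handle:
            head = handle.read(4)
    if head == b"CSI\x01":
        return "CSI"
    if head == b"BAI\x01":
        return "BAI"
    return "unknown"
```

Calling `index_kind()` on both the uploaded .csi and the .dat from the Galaxy dataset would show whether the .dat really holds a CSI index or whether a BAI was written in its place. Two genuine CSI files can still differ byte for byte, for example if they were compressed differently.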