marbl / CHM13

The complete sequence of a human genome

BAM files #10

Closed pkerpedjiev closed 4 years ago

pkerpedjiev commented 4 years ago

Hey, it's really amazing that you made all this data available! I was just curious if you happened to have alignments (BAM or CRAM) to the latest v1.0 assembly?

Rohit-Satyam commented 4 years ago

Also, are the annotations available for this assembly?

skoren commented 4 years ago

I just added bam files for the assemblies of the relevant techs (HiFi, CLR, ONT, Illumina).

As for annotations, I expect liftover annotations will be available through UCSC soon.

pkerpedjiev commented 4 years ago

Thanks a bunch! At the risk of pushing my luck, you wouldn't happen to have index (.bai) files as well would you?

skoren commented 4 years ago

We do have bai files but I usually prefer not to post easily-regenerated data/files to avoid cluttering up the downloads too much. The bai should be possible to regenerate on your side by running samtools index, no? Let me know if that doesn't work for some reason.
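For reference, the regeneration step mentioned here can be scripted. The following is a minimal sketch, assuming `samtools` is installed and on `PATH` and that the BAM is already downloaded and coordinate-sorted; the function names and the file path are hypothetical.

```python
import shutil
import subprocess
from typing import List


def index_command(bam_path: str) -> List[str]:
    """Build the `samtools index` invocation for a local BAM file."""
    return ["samtools", "index", bam_path]


def build_index(bam_path: str) -> str:
    """Run `samtools index` and return the default index path (<bam>.bai)."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    # check=True raises CalledProcessError if samtools exits non-zero.
    subprocess.run(index_command(bam_path), check=True)
    return bam_path + ".bai"
```

Calling `build_index("my.bam")` would be expected to leave `my.bam.bai` next to the input, which is the default output name for a BAI index.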

pkerpedjiev commented 4 years ago

That makes sense, but even though the .bai files are easy to generate the process does require downloading the entire BAM files. That's a lot of data transfer to generate a small index file.

One potential solution would be to store them without explicitly linking to them. For example, if the BAM file is stored at:

s3://bucket/my.bam

Then the index file would be at:

s3://bucket/my.bam.bai

When I first saw the bam files, I almost instinctively checked for the bai files at those locations. It wouldn't clutter the output and it would save a lot of data transfer.

What do you think? If that's not a feasible suggestion, then I can just generate them myself.
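The naming convention suggested above (index URL = BAM URL plus `.bai`) can be probed without downloading anything. This is a rough sketch using only the Python standard library; `bai_url` and `url_exists` are hypothetical helper names, and the check assumes an HTTP(S) URL rather than the `s3://` placeholder used in the example.

```python
import urllib.request


def bai_url(bam_url: str) -> str:
    """Return the conventional index location: the BAM URL plus '.bai'."""
    return bam_url + ".bai"


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request, so no file body is transferred."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError (e.g. 403/404) and socket timeouts.
        return False
```

A caller could run `url_exists(bai_url(some_bam_url))` and fall back to generating the index locally if it returns `False`.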

skoren commented 4 years ago

OK, I added the bai files. Each one has the same name as its bam, with .bai appended. So there is

https://s3.amazonaws.com/nanopore-human-wgs/chm13/assemblies/alignments/chm13.draft_v0.9.clr.bam

and corresponding

https://s3.amazonaws.com/nanopore-human-wgs/chm13/assemblies/alignments/chm13.draft_v0.9.clr.bam.bai

pkerpedjiev commented 4 years ago

Woo hoo! Thank you Sergey!! You saved me a lot of downloading.
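The saving comes from random access: with a `.bai` sitting next to a remote `.bam`, an htslib-based tool can request only the byte ranges that cover a region. Below is a minimal sketch, assuming a `samtools` build with remote (libcurl) support; the helper names, the contig name, and the coordinates are hypothetical placeholders, since the thread does not list the contig names in these files.

```python
import subprocess
from typing import List


def region_string(contig: str, start: int, end: int) -> str:
    """Format a 1-based, inclusive samtools region such as 'chr1:1-1000'."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return f"{contig}:{start}-{end}"


def view_command(bam_url: str, region: str, out_path: str) -> List[str]:
    """Build a `samtools view` call that writes one region to a local BAM."""
    # -b: emit BAM; -o: output file. The index is looked up next to the BAM.
    return ["samtools", "view", "-b", "-o", out_path, bam_url, region]


def fetch_region(bam_url: str, region: str, out_path: str) -> None:
    """Run the command; raises CalledProcessError if samtools fails."""
    subprocess.run(view_command(bam_url, region, out_path), check=True)
```

For example, `fetch_region(bam_url, region_string("chr1", 1, 1000), "slice.bam")` would be expected to transfer the index plus a small slice of the alignment file rather than the whole BAM.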

And in case you're curious, here's the reason I was asking: pretty pileups from https://resgen.io/viewer/CM7MAF6nTymyK92-ZLQZTw


marcpaga commented 1 year ago

Would it be possible to upload the bam files for v2.0?

arangrhie commented 1 year ago

We have all the bam files from CHM13 reads aligned to v1.1, which is v2.0 except for chrY (note that CHM13 doesn't have a Y; the Y in v2.0 is from HG002). They are here: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=T2T/CHM13/assemblies/alignments/