oscarlr / IGenotyper


Inquiry About Accession Data Usage #19

Open HY6Wu opened 2 weeks ago

HY6Wu commented 2 weeks ago

I am very grateful to you for creating such a useful tool. I am currently trying to use the data from your study in my research and have run into some difficulties. Specifically, I am unsure whether the data provided consists of CCS reads or subreads. I converted the downloaded SRA files to FASTA format and ran hifiasm on them, but it couldn’t find any overlaps. I also tried converting the SRA files to BAM format and extracting CCS reads, yet the assembly still failed.

Could you kindly clarify whether the data in PRJNA555323 can be used for assembly, or if I might need to take a different approach?

Thank you again for your important work, and I appreciate any guidance you can provide.

oscarlr commented 1 week ago

Hi @HY6Wu ,

If the data is from PacBio RS II, then it’s subreads. If it’s from Sequel II/IIe, then it is CCS. It might be more helpful if you could provide the accession number.
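One rough way to check which kind of reads a converted FASTA holds is to look at the read names. The sketch below is not part of IGenotyper. It assumes the original PacBio read names survived the SRA conversion somewhere in each header line, and it relies on the usual PacBio naming convention: subreads are named `<movie>/<zmw>/<start>_<end>` and CCS reads `<movie>/<zmw>/ccs`. The function names are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# PacBio read-name conventions (assumed to be preserved in the FASTA headers):
#   subread: <movie>/<zmw>/<start>_<end>
#   CCS:     <movie>/<zmw>/ccs
SUBREAD_RE = re.compile(r"^[^/\s]+/\d+/\d+_\d+$")
CCS_RE = re.compile(r"^[^/\s]+/\d+/ccs$")


def classify_read_name(name):
    """Return 'ccs', 'subread', or 'unknown' for a single read name."""
    if CCS_RE.match(name):
        return "ccs"
    if SUBREAD_RE.match(name):
        return "subread"
    return "unknown"


def classify_headers(lines):
    """Count read types over the '>' header lines of a FASTA file.

    SRA-converted headers often put an SRA-style ID first and the original
    name later, so every whitespace-separated token is checked.
    """
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        labels = {classify_read_name(tok) for tok in line[1:].split()}
        if "ccs" in labels:
            counts["ccs"] += 1
        elif "subread" in labels:
            counts["subread"] += 1
        else:
            counts["unknown"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python classify_reads.py reads.fasta
    with open(sys.argv[1]) as handle:
        print(classify_headers(handle))
```

If most headers come back as `unknown`, the original names were probably dropped during conversion, and the sequencing platform listed on the SRA run page is the more reliable indicator.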

HY6Wu commented 1 week ago

Thank you so much for your helpful response!

I am planning to use the NA19240 and NA12878 datasets from your paper, which, as it turns out, are subreads. However, the data downloaded from SRA is in SRA format and cannot be directly converted into the BAM format that CCS requires as input. I will make another attempt to convert the data.


Once again, thank you for your assistance.