nf-core / test-datasets

Test data to be used for automated testing with the nf-core pipelines
https://nf-co.re
MIT License
105 stars 353 forks

added pacbio longread file #1342

tanyasarkjain closed 1 month ago

tanyasarkjain commented 1 month ago

I want to add a PacBio long-read DNA file (with read group (RG) information) for testing with pbsv and other PacBio-based modules.

tanyasarkjain commented 1 month ago

Thanks - test.sorted.data is aligned to the entire hg38.fa, I believe, and the 35 reads are spread out across the genome. So I opted instead to use publicly available PureTarget PacBio reads from https://downloads.pacbcloud.com/public/dataset/PureTargetRE/Coriell/PBMM2-BAM-Input-For-IGV-And-TRGT/, downsampled them, and created a new FASTA, genome3.fasta, covering the coordinates of these reads (chr19:45760000-45770300), as the module I'm working on also requires a FASTA input.
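For anyone following along, here is a rough sketch of how a sub-reference like this could be cut out of a larger FASTA. It is not the exact procedure used for genome3.fasta; the function names (`parse_region`, `read_fasta`, `extract_region`) and the toy input are made up for illustration, and the only value taken from this thread is the region string. It assumes the region is 1-based and inclusive, as in samtools-style region strings, and names the output record `chrom:start-end`, which is how `samtools faidx` labels extracted regions.

```python
import re


def parse_region(region):
    """Split a 'chrom:start-end' string (assumed 1-based, inclusive)."""
    match = re.fullmatch(r"([^:]+):([\d,]+)-([\d,]+)", region)
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in region: {region!r}")
    return chrom, start, end


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_region(fasta_lines, region, width=60):
    """Return a FASTA-formatted string holding just the requested region."""
    chrom, start, end = parse_region(region)
    for name, seq in read_fasta(fasta_lines):
        if name == chrom:
            if end > len(seq):
                raise ValueError(f"{region} extends past the end of {chrom}")
            sub = seq[start - 1:end]  # convert 1-based inclusive to a slice
            body = [sub[i:i + width] for i in range(0, len(sub), width)]
            return "\n".join([f">{chrom}:{start}-{end}"] + body) + "\n"
    raise KeyError(f"contig {chrom!r} not found")


# Toy input standing in for a real reference (hypothetical data).
toy_fasta = [">chrA example contig", "ACGTAC", "GTTTAA", ">chrB", "GGGG"]
print(extract_region(toy_fasta, "chrA:3-8"), end="")

# Length of the region mentioned above, under the inclusive convention.
chrom, start, end = parse_region("chr19:45760000-45770300")
print(chrom, end - start + 1)
```

With a real reference you would pass an open file handle instead of `toy_fasta`. Holding a whole chromosome in memory is fine for a one-off like this, but an indexed tool is the better choice for anything larger.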

fellen31 commented 1 month ago

> Thanks - test.sorted.data is aligned to the entire hg38.fa I believe and the 35 reads are spread out across the genome

Yes, sorry. You are right, I forgot I needed it to be aligned to "real" coordinates of GRCh38 to use it as input for Paraphase.