NVlabs / nvbio

NVBIO is a library of reusable components designed to accelerate bioinformatics applications using CUDA.
BSD 3-Clause "New" or "Revised" License

Upload testing files somewhere #25

Closed: r-barnes closed 4 years ago

r-barnes commented 5 years ago

Running ./nvbio-test/nvbio-test gives the following errors:

warning : unable to open bwt "./data/human.NCBI36/Human.NCBI36.bwt"
error   :     failed opening file "./data/SRR493095_1.fastq.gz"

These files were likely too large to include in the repo, but without them the test suite cannot complete. Is there a way to get them uploaded somewhere?

ps-account commented 5 years ago

Here's the fastq: https://www.ncbi.nlm.nih.gov/sra/SRX145461

foertter commented 4 years ago

Added documentation to README. Thanks!

r-barnes commented 4 years ago

@rdwrt: I wasn't able to get that file to work. Do you have a more specific link (this one brings you to a webpage from which there seem to be several files)?

jpantaleoni commented 4 years ago

You first have to build the .bwt from either a single fasta file or the complete list of chromosomes, using nvBWT: https://nvlabs.github.io/nvbio/nvbwt_page.html

A single file for hg18 can be found here: ftp://genome-ftp.cse.ucsc.edu/goldenPath/hg18/bigZips/hg18.2bit

You have to use the twoBitToFa utility to convert it to fasta (though maybe we have also added support for reading directly from it, I don't recall).

Alternatively, you can find all chromosomes for hg18 here: http://hgdownload.cse.ucsc.edu/goldenpath/hg18/chromosomes/

Or you could use the top-level assembly for hg19 here: ftp://ftp.ensembl.org/pub/grch37/current/fasta/homo_sapiens/dna/

But I'd go for the .2bit file first: it's the easiest and least error-prone option since it's already assembled.
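
The steps above (download the .2bit, convert it to fasta, build the index) can be sketched as a small Python helper that only assembles and prints the commands, so they can be reviewed before anything is run. This is a sketch under assumptions, not a tested recipe: the `nvBWT <input fasta> <output prefix>` argument order should be checked against the nvBWT page linked above, the output prefix is inferred from the path in the nvbio-test warning, and the `build_commands` name, the use of `curl`, and the intermediate file names are hypothetical.

```python
import shlex
from pathlib import Path

# URL quoted in the thread; the index prefix is inferred from the
# "./data/human.NCBI36/Human.NCBI36.bwt" path in the nvbio-test warning.
TWOBIT_URL = "ftp://genome-ftp.cse.ucsc.edu/goldenPath/hg18/bigZips/hg18.2bit"
INDEX_PREFIX = Path("data/human.NCBI36/Human.NCBI36")


def build_commands(workdir: Path = Path("."), nvbwt: str = "nvBWT") -> list[list[str]]:
    """Return the commands for download -> fasta -> BWT index, in order."""
    twobit = workdir / "hg18.2bit"   # hypothetical local file name
    fasta = workdir / "hg18.fa"      # hypothetical local file name
    prefix = workdir / INDEX_PREFIX
    return [
        ["mkdir", "-p", str(prefix.parent)],
        ["curl", "-o", str(twobit), TWOBIT_URL],
        # twoBitToFa takes the input .2bit and the output fasta path.
        ["twoBitToFa", str(twobit), str(fasta)],
        # Assumed invocation: nvBWT <input fasta> <output prefix>.
        [nvbwt, str(fasta), str(prefix)],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Pass the path to the built binary via the `nvbwt` argument if it is not on `PATH`. Printing rather than executing keeps the sketch safe to run on a machine that does not have the tools installed.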
