dib-lab / 2020-ibd

Analysis of publicly available metagenomic sequencing data from humans with IBD.
BSD 3-Clause "New" or "Revised" License

fix ihmp download rule #3

Open taylorreiter opened 4 years ago

taylorreiter commented 4 years ago

Currently, the iHMP data were copied into raw from disk, and the files were renamed from *_R1.fastq.gz to *_1.fastq.gz. The links to download the data were not included in the metadata spreadsheet.

To work with the Snakefile as currently written, the iHMP download rule needs to output files that end with _1.fastq.gz. However, with the public download links from ibdmdb.org, the data are first downloaded as a tar file containing the R1 and R2 files, and once unpacked, those files end with *_R1.fastq.gz. One workaround is to put the properly named files on OSF via Google Drive and provide their download links through the OSF downloader. A better alternative is probably to rewrite the Snakefile to accommodate the iHMP download format and naming conventions.
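As a rough illustration of a third possibility, renaming at extraction time so downstream rules still see *_1.fastq.gz, here is a minimal Python sketch. The function names `pipeline_name` and `extract_and_rename` are hypothetical and not part of this repository, and the sketch assumes the tar archive contains plain files whose names end in _R1.fastq.gz or _R2.fastq.gz; the actual layout of the ibdmdb.org archives may differ.

```python
import re
import shutil
import tarfile
from pathlib import Path

# Assumed iHMP read suffix: *_R1.fastq.gz / *_R2.fastq.gz.
_IHMP_SUFFIX = re.compile(r"_R([12])\.fastq\.gz$")


def pipeline_name(ihmp_name):
    """Map an iHMP-style name (*_R1.fastq.gz) to the *_1.fastq.gz form.

    Names that do not match the iHMP suffix are returned unchanged.
    """
    return _IHMP_SUFFIX.sub(r"_\1.fastq.gz", ihmp_name)


def extract_and_rename(tar_path, out_dir):
    """Extract read files from a tar archive into out_dir under renamed names.

    Only regular files matching the iHMP suffix are written; directory
    components inside the archive are dropped. Returns the sorted list of
    file names written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            base = Path(member.name).name
            if not member.isfile() or not _IHMP_SUFFIX.search(base):
                continue
            target = out / pipeline_name(base)
            # Stream the member to disk without loading it into memory.
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target.name)
    return sorted(written)
```

Something like this could be called from a `run:` block in the download rule, so that the rule's declared outputs keep the existing _1.fastq.gz / _2.fastq.gz naming.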