DerrickWood / kraken

Kraken taxonomic sequence classification system
http://ccb.jhu.edu/software/kraken/
GNU General Public License v3.0

"kraken-build --download-library human" ends up with empty Human library #15

Closed by nturaga 9 years ago

nturaga commented 9 years ago

H_sapiens library folder and size.

0B ./CHR_01
0B ./CHR_02
0B ./CHR_03
0B ./CHR_04
0B ./CHR_05
0B ./CHR_06
0B ./CHR_07
0B ./CHR_08
0B ./CHR_09
0B ./CHR_10
0B ./CHR_11
0B ./CHR_12
0B ./CHR_13
0B ./CHR_14
0B ./CHR_15
0B ./CHR_16
0B ./CHR_17
0B ./CHR_18
0B ./CHR_19
0B ./CHR_20
0B ./CHR_21
0B ./CHR_22
0B ./CHR_MT
0B ./CHR_Un
0B ./CHR_X
0B ./CHR_Y

nturaga commented 9 years ago

I believe there is an error in the regex in "kraken/scripts/download_genomic_library.sh"

nturaga commented 9 years ago

Oops! Sorry, Derrick. The previous version, Kraken 0.10.3-beta, has the issue; your current version on GitHub does not. Homebrew needs to update its version to what you have on git now.

Previous Version

The updated version downloads the required files, but I am unable to run kraken-build --build on it. After updating, this is the error I get.

ERROR

lmdu commented 9 years ago

I found this bug too. I checked the download_genomic_library.sh script, and there is a Perl regular expression error on line 103. You can change:

file=$(perl -nle '/^-/ and /\b(hs_refGRCh\w+.fa.gz)\s$/ and print $1' .listing)

to:

file=$(perl -nle '/^-/ and /\b(hs_refGRCh\d+.\w+.fa.gz)\s$/ and print $1' .listing)

Hope this helps.

DerrickWood commented 9 years ago

@mencent This appears to be a different issue to the one Nitesh brought up, and is related to the new filenames for the GRCh38.p2 release. (Nitesh and I ended up handling that over private email.) In any event, I've patched the file with a similar regex that will hopefully be robust to future name changes (although I'm sure there's another curveball on the horizon).

Thanks, Derrick