biobricks-ai / biobricks-issues

A repo for consolidating issues

RefSeq #4

Closed jborden closed 2 years ago

jborden commented 2 years ago

https://ftp.ncbi.nlm.nih.gov/refseq/

jborden commented 2 years ago

The total size of this dataset is:

$ echo "du -hs ." | lftp ftp://ftp.ncbi.nlm.nih.gov/refseq/ 2>&1
6.3T    .

on ws2:

$ df -h
...
/dev/md0         11T  1.6T  9.3T  15% /mnt/raid

The 6.3T download would fit in the 9.3T free on /mnt/raid, but I don't think that leaves enough room to even process this dataset.
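
A rough way to make that check explicit, using the numbers above. This is only a sketch: `has_room` is a hypothetical helper, the 2x headroom factor for intermediate files is an assumption rather than a measured requirement, and T is treated as 1024^4 bytes.

```shell
# Hypothetical helper: succeeds if the free space covers the dataset size
# multiplied by a headroom factor (assumed, default 2) for processing.
has_room() {
  dataset_bytes=$1   # size of the download, in bytes
  free_bytes=$2      # available space on the target filesystem, in bytes
  factor=${3:-2}     # assumed multiplier for intermediate/processed files
  [ "$free_bytes" -ge $((dataset_bytes * factor)) ]
}

# Figures from the du and df output above, treating T as 1024^4 bytes.
TIB=$((1024 * 1024 * 1024 * 1024))
dataset=$((63 * TIB / 10))   # 6.3T reported by lftp
free=$((93 * TIB / 10))      # 9.3T available on /mnt/raid

if has_room "$dataset" "$free"; then
  echo "enough room"
else
  echo "not enough room"
fi
```

With a factor of 1 (download only) the check passes, which matches the situation above: the raw files fit, but there is little left over for processing. On a GNU system the free figure could be read live with `df --output=avail -B1 /mnt/raid | tail -n 1` instead of being hard-coded.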

tomlue commented 2 years ago

yep, this one is too big for now :) goal for the future maybe