biostars / biostar-handbook

Issue tracker for the Biostar Handbook

clustering corona sequences to trace origin #128

Open KenSaville opened 3 years ago

KenSaville commented 3 years ago

The command

cat metadata.txt | grep Dec | grep complete | grep -v gapped | cut -f 1 > early.ids

returns no lines from the metadata.txt file.

I believe this is because the dates in the file are in numeric form, so the month name Dec never appears.

Grepping for 2019 instead may solve the problem, but it would also match sequences from other months of 2019, if there were any.
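One possible workaround is to match the year and month together in numeric form. The sketch below is only an illustration: it assumes dates are written like 2019-12-26 and that the file is tab-separated with the ID in the first column. The file name sample_metadata.txt and the rows in it are made up, and the real metadata.txt may be laid out differently.

```shell
# Hypothetical sample standing in for metadata.txt.
# Assumed layout (not confirmed): ID <tab> date <tab> description.
printf 'SAMPLE1\t2019-12-26\tcomplete\n'        >  sample_metadata.txt
printf 'SAMPLE2\t2020-01-15\tcomplete\n'        >> sample_metadata.txt
printf 'SAMPLE3\t2019-12-30\tcomplete gapped\n' >> sample_metadata.txt
printf 'SAMPLE4\t2019-11-02\tcomplete\n'        >> sample_metadata.txt

# Match December 2019 by its numeric form instead of the month name "Dec".
# Matching "2019-12" rather than "2019" keeps other months of 2019 out.
grep '2019-12' sample_metadata.txt | grep complete | grep -v gapped | cut -f 1 > early.ids

# Show the selected IDs.
cat early.ids
```

If the dates turn out to use another numeric layout (for example 12/26/2019), the pattern would need to change to match it. Note also that a plain grep matches the string anywhere on the line, so restricting the match to the date column (for example with awk) would be safer on real data.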

ialbert commented 3 years ago

There is a major problem with the entire book: in subsequent months NCBI changed many of the formats, as they themselves weren't sure what the most appropriate way to distribute the data was. The concepts are still valid, but the slight changes in the data make the code work differently.

The whole book will be rewritten in the next two months, using a new service by NCBI called datasets:

https://www.ncbi.nlm.nih.gov/datasets/

KenSaville commented 3 years ago

OK

I was able to work around it - just thought I'd point it out. I'll be using the handbook to teach a class starting in about a month.

I'm fine with running into issues here and there and then trying to figure it out, sharing the process with students.

That's half, if not more, of the battle.

Ken

On Mon, Sep 21, 2020 at 2:47 PM Istvan Albert wrote:

> There is a major problem with the entire book: in subsequent months NCBI changed many of the formats, as they themselves weren't sure what the most appropriate way to distribute the data was. The concepts are still valid, but the slight changes in the data make the code work differently.
>
> The whole book will be rewritten in the next two months, using a new service by NCBI called datasets:
>
> https://www.ncbi.nlm.nih.gov/datasets/


-- Ken Saville, PhD, A.M. Chickering Professor, Chair, Biology Department, Albion College