biocore / American-Gut

American Gut open-access data and IPython notebooks

Zeros truncated in AG data mapping file #130

Closed jwdebelius closed 9 years ago

jwdebelius commented 9 years ago

The leading zeros of the barcode prefix are truncated in the AG_full.txt map and the AG.txt file.

wasade commented 9 years ago

Same with the even1/10k files. Very weird. Good catch, will issue a PR tomorrow.

wasade commented 9 years ago

I know what happened... used Excel...

ElDeveloper commented 9 years ago

Just so others are aware, AG sample IDs are prone to be interpreted as integers, not only by Excel but also by Pandas, so just be careful.
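For anyone loading the mapping files with pandas, here is a minimal sketch of one way to keep the IDs intact. It assumes a tab-separated file whose ID column is named `#SampleID`; the sample IDs, the `BODY_SITE` column, and the `load_mapping` helper are made up for illustration and are not taken from the AG files.

```python
import io

import pandas as pd

# Hypothetical mapping file contents; IDs and column names are illustrative only.
MAPPING_TEXT = (
    "#SampleID\tBODY_SITE\n"
    "000001234\tfeces\n"
    "000005678\tskin\n"
)


def load_mapping(handle, id_column="#SampleID"):
    """Read a tab-separated mapping file, keeping every column as text."""
    # dtype=str stops pandas from converting numeric-looking IDs to integers.
    return pd.read_csv(handle, sep="\t", dtype=str).set_index(id_column)


# Default type inference: the all-numeric ID column is parsed as integers.
inferred = pd.read_csv(io.StringIO(MAPPING_TEXT), sep="\t")

# Explicit string dtype: the IDs come back exactly as written in the file.
as_text = load_mapping(io.StringIO(MAPPING_TEXT))

print(inferred["#SampleID"].tolist())
print(as_text.index.tolist())
```

Casting to string after the fact does not help, since the zeros are already gone once the column has been parsed as integers.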

wasade commented 9 years ago

yes, but we have things like "BLANK123.foo.1231" included

On Mon, Mar 9, 2015 at 11:23 AM, Yoshiki Vázquez Baeza <notifications@github.com> wrote:

Just so others are aware, AG sample IDs are prone to be interpreted as integers, not only by Excel but also by Pandas, so just be careful.


ElDeveloper commented 9 years ago

Right. AFAIK pandas doesn't require all samples to have the same type; it is kinda funny in that it infers the type as int even if there are a few values that cannot be cast.
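What pandas actually ends up with can depend on the file and the parser settings, so it may be worth checking rather than assuming. A small sketch of such a check follows; `value_type_counts` is a hypothetical helper, and the column is built by hand to stand in for an ID column that already holds a mix of integers and strings.

```python
import pandas as pd


def value_type_counts(column):
    """Count how many values of each Python type a column holds."""
    type_names = column.map(lambda value: type(value).__name__)
    return type_names.value_counts().to_dict()


# Hand-built stand-in for an ID column with mixed types (illustrative values).
mixed = pd.Series([1234, 5678, "BLANK123.foo.1231"], dtype=object)

print(value_type_counts(mixed))
```

If anything other than `str` shows up for a sample ID column, re-reading the file with an explicit string dtype is probably the safer route.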

On (Mar-09-15|10:26), Daniel McDonald wrote:

yes, but we have things like "BLANK123.foo.1231" included
