dhimmel / fratjuice

Uncovering the microbes of fraternity basements
Creative Commons Zero v1.0 Universal

Update kits.tsv with uBiome mapping info #8

Closed dhimmel closed 6 years ago

dhimmel commented 6 years ago

Update kits.tsv with data from the Mapping tab of ubiome-processed-outputs/results.xlsx, which was added in #7.

The Python code used to update kits.tsv (run with samples/ubiome as the working directory):

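import pandas

# Map uBiome column names to the snake_case names used in kits.tsv
renamer = {
    'Sample_type': 'sample_type',
    'SeqID': 'seq_id',
    'tubeId': 'tube_id',
}
# Note: read_excel loads the first sheet unless sheet_name is given
ubiome_df = (
    pandas.read_excel('ubiome-processed-outputs/results.xlsx')
    .rename(columns=renamer)
    .drop(['Sample_ID', 'count', 'count_norm', 'tax_name', 'tax_rank'], axis='columns')
)
kits_df = pandas.read_table('kits.tsv')
# Strip hyphens from the kit ID and cast to int
kits_df['barcode'] = kits_df['kit'].str.replace('-', '').astype(int)
# No `on` argument, so merge joins on all columns the two frames share
merged_df = kits_df.merge(ubiome_df)
# Keep only the output columns, in this order
merged_df = merged_df[['sample_id', 'sample_type', 'order_id', 'kit', 'tube', 'tube_id', 'seq_id']]
# Overwrite kits.tsv in place
merged_df.to_csv('kits.tsv', sep='\t', index=False)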
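
Because the merge above is an inner join by default, any kit without a matching uBiome row would be dropped silently when kits.tsv is overwritten. A possible sanity check is sketched below on toy data. It assumes that barcode is the shared join key and that each barcode appears once in each table; neither is confirmed by this PR, and all values shown are made up.

```python
import pandas

# Toy stand-ins for kits.tsv and the uBiome sheet (hypothetical values).
kits_df = pandas.DataFrame({
    'sample_id': ['s1', 's2', 's3'],
    'kit': ['123-456-789', '234-567-890', '345-678-901'],
})
ubiome_df = pandas.DataFrame({
    'barcode': [123456789, 234567890],
    'seq_id': ['seqA', 'seqB'],
})

# Same barcode derivation as in the update script.
kits_df['barcode'] = kits_df['kit'].str.replace('-', '').astype(int)

# Left merge on an explicit key (assumed to be 'barcode').
# validate raises MergeError if a barcode is duplicated on either side;
# indicator=True adds a '_merge' column showing where each row came from.
check_df = kits_df.merge(
    ubiome_df,
    on='barcode',
    how='left',
    validate='one_to_one',
    indicator=True,
)

# Kits present in kits.tsv but absent from the uBiome data.
unmatched = check_df.loc[check_df['_merge'] == 'left_only', 'kit'].tolist()
print(unmatched)
```

If the list is non-empty, those kits would be missing from the rewritten kits.tsv under the inner join.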