Closed sh4nth closed 2 years ago
I just got back from traveling and saw your comment on the other PR: https://github.com/microDM/MicFunPred/pull/5#issuecomment-1100039284
I'm happy to fix it, but could you explain what you meant?
The existing files don't all follow exactly the same format 👇
==> micfunpreDefinitions/data/db/aquatic_16S_cp.txt <==
copy_number
alpha_proteobacterium_SCGC_AAA536-G10_(unscreened) 0
alpha_proteobacterium_SCGC_AAA536-K22_(unscreened) 1
==> micfunpreDefinitions/data/db/human_16S_cp.txt <==
copy_number
candidate division SR1 bacterium taxon 345 1OR1 (unscreened) 2
Jonquetella anthropi ADV 126, DSM 22815 3
==> micfunpreDefinitions/data/db/mammals_16S_cp.txt <==
copy_number
Actinopolymorpha alba DSM 45243 2
Allobaculum stercoricanis DSM 13633 7
==> micfunpreDefinitions/data/db/plants_16S_cp.txt <==
copy_number
Actinopolymorpha_alba_DSM_45243 2
Saccharibacter_floricola_DSM_15669 3
==> micfunpreDefinitions/data/db/rrnDB_16S_cp.tsv <==
name copy_number
Komagataeibacter europaeus 4.50
Komagataeibacter xylinus 4.60
Ah! I see you fixed this upstream already in https://github.com/microDM/MicFunPred/commit/a64fa02700ad2103f401752571fa177939592f95
Thanks!
This will match the other copy number tables and allow pd.concat to work correctly.
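For anyone reading later, here is a rough sketch of why the header matters for pd.concat. It is not the MicFunPred code: the two tiny frames are built by hand from rows shown above, and the `taxon` column name is a made-up stand-in for a table whose name column is labelled differently (or not at all).

```python
import pandas as pd

# One table with the header used by rrnDB_16S_cp.tsv ("name", "copy_number").
rrndb = pd.DataFrame(
    {"name": ["Komagataeibacter europaeus"], "copy_number": [4.50]}
)

# A second table whose name column carries a different label.
# "taxon" is a hypothetical label chosen only for this illustration.
other = pd.DataFrame(
    {"taxon": ["Actinopolymorpha alba DSM 45243"], "copy_number": [2]}
)

# pd.concat aligns on column labels, so mismatched headers produce the
# union of columns, with NaN wherever a table lacks that column.
mismatched = pd.concat([rrndb, other], ignore_index=True)

# With matching headers the rows stack into the same two columns.
matched = pd.concat(
    [rrndb, other.rename(columns={"taxon": "name"})], ignore_index=True
)

print(mismatched)
print(matched)
```

In the mismatched case the organism names end up split across two columns, which is presumably the failure this change avoids.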