Open kcmtest opened 3 years ago
@kcmtest Did you find a way to get the OTU count table? I also need that table to compute alpha and beta diversity. Thank you!
If anyone is still wondering: in the *_nanoclust_out.txt file, the "reads_in_cluster" column holds the raw read counts from which the species-level relative abundances are calculated. These counts can be used for alpha/beta diversity and for differential abundance tests that require raw/absolute counts (LEfSe is fine with relative abundances). To get counts at ranks above species (for example genus), one can add the full taxonomy as columns and collapse (sum) the counts per rank.
Here is the function the developers used to calculate relative abundance:
def get_abundance_values(names,paths):
    dfs = []
    for name,path in zip(names,paths):
        data = pd.read_csv(path, index_col=False, sep=';').iloc[:,1:]
        total = sum(data['reads_in_cluster'])
        rel_abundance=[]
        for index,row in data.iterrows():
            rel_abundance.append(row['reads_in_cluster'] / total)
        data['rel_abundance'] = rel_abundance
        dfs.append(pd.DataFrame({'taxid': data['taxid'], 'rel_abundance': rel_abundance}))
        data.to_csv("" + name + "_nanoclust_out.txt")
    return dfs
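In case it helps, here is a minimal sketch of how the raw "reads_in_cluster" values could be merged into a taxid x sample count table, and then collapsed to a higher rank. This is not part of NanoCLUST: `build_count_table`, `collapse_counts` and the `taxid_to_rank` mapping are hypothetical names used for illustration. The separator is a parameter because the function above reads ';'-separated input but writes its output with the pandas default (comma), so check which one your files actually use.

```python
import pandas as pd


def build_count_table(names, paths, sep=","):
    """Return a taxid x sample table of raw read counts (hypothetical helper)."""
    columns = {}
    for name, path in zip(names, paths):
        data = pd.read_csv(path, sep=sep)
        # Sum in case several clusters were assigned the same taxid.
        columns[name] = data.groupby("taxid")["reads_in_cluster"].sum()
    # Taxa that are absent from a sample get a count of 0.
    return pd.DataFrame(columns).fillna(0).astype(int)


def collapse_counts(counts, taxid_to_rank):
    """Sum the rows whose taxids map to the same label, e.g. a genus name.

    taxid_to_rank is a user-supplied dict (an assumption of this sketch);
    taxids missing from it keep their own row.
    """
    relabeled = counts.rename(index=taxid_to_rank)
    return relabeled.groupby(level=0, sort=False).sum()
```

The resulting table has one row per taxon and one column per sample, which is the general layout that count-based tools such as DESeq2 expect. It can be written out with `table.to_csv("counts.csv")`.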
Previously I ran Kraken2, generated OTU tables at various taxonomic levels, and then ran a differential OTU analysis with DESeq2, since those tables contained raw counts.
How can I do the same with the NanoCLUST output? It gives relative abundances.
Any suggestions on how to go about this?