yonghyun09 closed this issue 1 year ago
@yonghyun09,
This is because you used the pseudocount=True option, which adds a pseudocount of 1 to every taxon in every sample so that the feature table doesn't contain any zeros. This option is useful when you intend to plot the y-axis in log scale (the log of 0 is undefined). In your case, the y-axis is not in log scale, so there is no need to add a pseudocount. Below is what happens when you remove the option:
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

taxa_names = ['Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Salmonella',]

hue_order = [
    'Sample-1', 'Sample-2', 'Sample-3', 'Sample-4', 'Sample-5', 'Sample-6', 'Sample-7', 'Sample-8', 'Sample-9',
    'Sample-10', 'Sample-11', 'Sample-12', 'Sample-13', 'Sample-14', 'Sample-15', 'Sample-16', 'Sample-17', 'Sample-18',
    'Sample-19', 'Sample-20', 'Sample-21', 'Sample-22', 'Sample-23', 'Sample-24', 'Sample-25', 'Sample-26', 'Sample-27',
    'Sample-28'
]

qzv_file = 'taxa-bar-plots.qzv'

fig, ax = plt.subplots(figsize=(15, 10))

dokdo.taxa_abundance_box_plot(
    qzv_file,
    level=6,
    taxa_names=taxa_names,
    pretty_taxa=True,
    show_others=False,
    hue='sample-id',
    hue_order=hue_order,
    ax=ax,
)

plt.tight_layout()
plt.savefig('test.png')
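To illustrate why the pseudocount puts boxes slightly above 0%, here is a small standalone sketch. It does not call dokdo. The counts are made up, the helper relative_abundance is hypothetical, and it assumes the pseudocount is added to the raw counts before they are converted to percentages.

```python
# Toy read counts (made up for illustration, not from the real feature table).
counts = {
    'Sample-1': {'Salmonella': 0, 'Escherichia': 90, 'Other': 10},
    'Sample-2': {'Salmonella': 25, 'Escherichia': 50, 'Other': 25},
}


def relative_abundance(sample_counts, pseudocount=False):
    """Return percent abundance per taxon for one sample.

    Hypothetical helper: when pseudocount is True, 1 is added to every
    count before normalizing, so no taxon ends up at exactly zero.
    """
    if pseudocount:
        sample_counts = {taxon: n + 1 for taxon, n in sample_counts.items()}
    total = sum(sample_counts.values())
    return {taxon: 100 * n / total for taxon, n in sample_counts.items()}


for sample, sample_counts in counts.items():
    without = relative_abundance(sample_counts)['Salmonella']
    with_pc = relative_abundance(sample_counts, pseudocount=True)['Salmonella']
    print(f'{sample}: {without:.2f}% without pseudocount, {with_pc:.2f}% with pseudocount')
```

In this toy case, the sample with no Salmonella reads gets a small nonzero percentage once the pseudocount is applied, which is the kind of near-zero box seen in the original plot.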
@sbslee
Thank you very much. It was right under my nose, but I did not notice it. 😂 Thank you for your kind explanation!
@sbslee
Hello Steven Lee,
Thank you for your kind answers to my persistent questions! They have been very helpful for my visualization analysis. I have a question about the taxa box plot.
Of the 28 samples I analyzed by NGS, only 4 were classified as containing Salmonella, so it appears in only those 4 samples in the taxa bar plot.
However, when I drew the box plot, as shown below, I observed boxes at slightly above 0% relative abundance for all of the other samples as well. The same problem occurred with other species: every sample was drawn on the box plot even though the taxon was present in only some of them. Is there a way to adjust the plot so that boxes appear only for the samples in which the taxon is actually present?
I looked at the Parameters section of the API docs, but I found it difficult to follow, so I would appreciate it if you could tell me which code to use.
Thank you for all your assistance.