Closed mishkb closed 3 years ago
@mishkb,
Thanks for the great suggestion! I just added the pretty_taxa
argument to the heatmap
method in the 1.11.0-dev
branch (the official release of 1.11.0
will be made within 10 days; I aim to make a Dokdo release every month). When you set pretty_taxa=True
it will display only the smallest taxonomic rank like how it's done in the taxa_abundance_bar_plot
method with the legend_short
argument:
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))
qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
dokdo.heatmap(qza_file,
normalize='log10',
flip=True,
ax=ax1)
dokdo.heatmap(qza_file,
normalize='log10',
flip=True,
pretty_taxa=True,
ax=ax2)
plt.tight_layout()
plt.savefig('heatmap.png')
Is this what you had in your mind? Let me know if you need more help.
This is wonderful - thanks for your prompt response. Based on your figures, this is just what I was looking for. I'll give it a whirl today.
OK @sbslee ,I've been able to implement that new function - thank you! Now I'm wondering...
taxa_abundance_bar_plot
Michelle
@mishkb,
Thanks for the great ideas. I think I can implement both fairly easily, but let's take one step at a time. I just added the label_columns
parameter to the dokdo.heatmap
method. It works the same way as it does for the dokdo. taxa_abundance_bar_plot
method.
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))
qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'
dokdo.heatmap(qza_file,
metadata=metadata_file,
normalize='log10',
flip=True,
pretty_taxa=True,
ax=ax1)
dokdo.heatmap(qza_file,
metadata=metadata_file,
normalize='log10',
flip=True,
pretty_taxa=True,
label_columns=['sample-id', 'body-site'],
ax=ax2)
plt.tight_layout()
If you are happy with above, we can move on to the next (i.e. showing top taxa). Let me know what you think.
Works exactly as I was hoping - thanks @sbslee!
Does the heatmap function automatically align labels vertically or horizontally depending on the figure size/space available? (I think I have to look more at the plt.tight_layout()
function and matplotlib
.
@mishkb,
Glad to hear that works for you. As for the rotation of tick labels, AFAIK, it is automatically determined by the seaborn.heatmap
method (which is used by dokdo.heatmap
). BTW, you can easily change the rotation of tick labels in the following way:
ax = dokdo.heatmap(...)
for ticklabel in ax.get_xticklabels():
ticklabel.set_rotation(90)
It sounds like we are good to move on to the next task: showing only the top taxa. Stay tuned.
@sbslee I have just realised the 'prettified names" function actually truncates species names to just show the sname, not gGenus; s__species name. I just noticed this in the taxa_abundance_bar_plot
plot and obviously, it also is the same now in heatmaps if I use that option.
Is there a way to return the last two ranks - so in your example,
>>> a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
>>> print(dokdo.pname(a))
g__Actinomyces;s__Schaalia_radingae
I am using taxonomy from NCBI that separates the genus names and species names - so with the 'prettified names', it is impossible to tell which genus the species come from.
(BTW, I can make this a separate issue if that is better).
@mishkb,
No worries about opening another issue. And thanks for the great suggestion. I myself have come across that issue before. The good news is I just updated the pname
method to allow returning of more than one taxonomic rank:
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
dokdo.pname(a)
Will produce the usual:
's__Schaalia_radingae'
However, if you provide the levels
option:
a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
dokdo.pname(a, levels=[6, 7])
It will give what you want:
'g__Actinomyces;s__Schaalia_radingae'
I extended this to the heatmap
and taxa_abundance_bar_plot
methods by adding the pname_kws
argument to both methods:
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))
qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
dokdo.heatmap(qza_file,
normalize='log10',
flip=True,
ax=ax1)
dokdo.heatmap(qza_file,
normalize='log10',
flip=True,
pretty_taxa=True,
pname_kws=dict(levels=[2, 3]),
ax=ax2)
plt.tight_layout()
qzv_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/taxa-bar-plots.qzv'
dokdo.taxa_abundance_bar_plot(qzv_file,
figsize=(10, 7),
level=6,
count=8,
legend_short=True,
pname_kws=dict(levels=[5, 6]),
artist_kwargs=dict(show_legend=True,
legend_loc='upper left'))
plt.tight_layout()
Updated documentations:
pname: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#module-dokdo.api.pname heatmap: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#heatmap taxa_abundance_bar_plot: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#taxa-abundance-bar-plot
Does this work for you?
That looks great to me @sbslee - I hope yourself and others will find it useful as well. I really appreciate you making these adjustments and answering my questions.
@mishkb,
No problem! I also added the count
argument to the heatmap
method; it's similar to the count
argument in the taxa_abundance_bar_plot
method:
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))
qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'
dokdo.heatmap(qza_file,
metadata=metadata_file,
normalize='log10',
flip=True,
pretty_taxa=True,
ax=ax1)
dokdo.heatmap(qza_file,
metadata=metadata_file,
normalize='log10',
flip=True,
pretty_taxa=True,
count=20,
ax=ax2)
plt.tight_layout()
Let me know if this is what you wanted.
Fantastic @sbslee! This has worked perfectly with my data. Thank you again for adding those options to your code. I'm so grateful that I found Dokdo and that you are still actively working on it.
Hi @sbslee , Thanks for your very recent updates to heatmap (from clustermap) - just in time for what I need. Just wondering how I could incorporate your 'prettified names' function - to get taxa names as the y-axis labels such as is possible in the legend of the taxa_abundance_bar_plot output?
Thanks, Michelle