Open nitishnih opened 4 years ago
There are a couple of combinations of rows with a lot of 0s. When calculating the correlation for such a pair, both the numerator and the denominator are 0, which results in NaN.
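To make the 0/0 case concrete, here is a minimal sketch in plain Python. It is not hclust2's code; the function name `correlation_distance` is made up for this example, and the explicit `den == 0.0` branch stands in for what IEEE 754 floating-point division of 0 by 0 produces (NaN).

```python
import math

def correlation_distance(u, v):
    """Return 1 - Pearson correlation of u and v, or NaN when it is undefined."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    uc = [x - mean_u for x in u]  # centered copies of both rows
    vc = [x - mean_v for x in v]
    num = sum(a * b for a, b in zip(uc, vc))
    den = math.sqrt(sum(a * a for a in uc) * sum(b * b for b in vc))
    if den == 0.0:
        # At least one row is constant (e.g. all 0s), so num is 0 as well: 0/0.
        return math.nan
    return 1.0 - num / den

print(correlation_distance([0, 0, 0, 0], [0, 0.5, 0, 1.2]))  # nan
print(correlation_distance([1, 2, 3], [2, 4, 6]))            # 0.0
```

A single NaN like this is enough to make the whole distance matrix non-finite, which would explain why clustering refuses to run.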
Thanks for the explanation @fbeghini. Does this mean hclust2 cannot process this merged abundance file created by metaphlan? Can this be handled in a different way by the program or do users need a manual workaround?
I'd first keep only the species-level entries, and then maybe try a different method for species/feature distance calculation.
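A rough sketch of that species-level filter, in case it helps. It assumes the merged table is tab-separated, that the header row starts with `clade_name`, and that lineages are pipe-separated with `s__` marking species and `t__` marking strains; the function name `species_rows` is hypothetical, so adjust it to your file.

```python
def species_rows(lines):
    """Keep the header row and the rows whose last lineage field is a species."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines such as the database version line
        clade = line.split("\t", 1)[0]
        if clade == "clade_name":
            kept.append(line)  # header row (assumed name)
        elif clade.split("|")[-1].startswith("s__"):
            kept.append(line)  # species-level entry, not genus or strain
    return kept

# Hypothetical usage with a file on disk:
# with open("merged_abundance_table.txt") as fh:
#     filtered = species_rows(fh)
```

Checking only the last lineage field means that strain-level rows (ending in `t__...`) and higher ranks are both excluded.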
Hi there,
I'm having the exact same issue as nitishnih when trying to generate a heat map (using hclust2) from a merged abundance table generated by metaphlan3. I also followed the step for altering the abundance table by removing the header and the NCBI_tax_id column. I was just wondering if this issue had been fixed or resolved? Or is the recommended advice to use an alternative species/feature distance calculation method?
@fbeghini, could you kindly respond to the thread on the bioBakery help forum about this same issue, or explain here how to resolve the error when rows contain 0 values? How can those rows be removed from merged_abundance_table_species_table.txt?
I'm having this issue too. What's the fix? I am unable to recreate the heatmap in the example, even when adding --no_fclustering and --no_sclustering. Thank you, Rene
I am also having this issue
This has confused me for many days. Does anyone have a solution? Thanks, Bai
I also encountered this error. I was able to run successfully when I turned off clustering (--no_fclustering and --no_sclustering). This error may occur if the samples contain mostly 0s. You can avoid this by adding a value that is very small compared to the data (e.g. 0.01) to all samples.
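As a possible manual pre-filter (a sketch only, not part of hclust2), rows that are constant across all samples, including all-zero rows, can be dropped before plotting. One caveat for the correlation metric: adding the same small value to every cell leaves a constant row constant, so its correlation stays undefined and a filter like this may still be needed. The function name and the `(name, values)` layout are made up for illustration.

```python
def drop_constant_rows(rows):
    """rows: list of (feature_name, values); keep rows whose values vary."""
    return [(name, vals) for name, vals in rows if max(vals) != min(vals)]

table = [
    ("s__A", [0.0, 0.0, 0.0]),     # all zeros: dropped
    ("s__B", [0.0, 1.5, 0.0]),     # varies: kept
    ("s__C", [0.01, 0.01, 0.01]),  # constant after a pseudo-count: dropped
]
print(drop_constant_rows(table))  # [('s__B', [0.0, 1.5, 0.0])]
```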
Maybe you want to check: https://forum.biobakery.org/t/hclust2-py-error-distance-matrix-finite-values/1732/2
I would appreciate it if somebody could tell me whether the change is valid or not. Eric
Just a note that I also see this, including with the example data that comes with this repo using the run.sh script. @EricDeveaud's changes (linked above) do seem to progress past the issue, but I run into another downstream problem in matplotlib:
% ./hclust2.py \
    -i examples/HMP-MetaPhlAn/HMP.species.txt \
    -o HMP.sqrt_scale.png \
    --skip_rows 1 \
    --ftop 50 \
    --f_dist_f correlation \
    --s_dist_f braycurtis \
    --cell_aspect_ratio 9 \
    -s --fperc 99 \
    --flabel_size 4 \
    --metadata_rows 2,3,4 \
    --legend_file HMP.sqrt_scale.legend.png \
    --max_flabel_len 100 \
    --metadata_height 0.075 \
    --minv 0.01 \
    --no_slabels \
    --dpi 300 \
    --slinkage complete
Traceback (most recent call last):
  File "/Users/cjfields/research/biotech/swanson/2022-August-metagenome/src/hclust2/./hclust2.py", line 1244, in <module>
    hclust2_main()
  File "/Users/cjfields/research/biotech/swanson/2022-August-metagenome/src/hclust2/./hclust2.py", line 1240, in hclust2_main
    hm.draw()
  File "/Users/cjfields/research/biotech/swanson/2022-August-metagenome/src/hclust2/./hclust2.py", line 1028, in draw
    im = ax_hm.imshow(
  File "/Users/cjfields/miniforge3/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 454, in wrapper
    return func(*args, **kwargs)
  File "/Users/cjfields/miniforge3/lib/python3.9/site-packages/matplotlib/__init__.py", line 1423, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/Users/cjfields/miniforge3/lib/python3.9/site-packages/matplotlib/axes/_axes.py", line 5577, in imshow
    im._scale_norm(norm, vmin, vmax)
  File "/Users/cjfields/miniforge3/lib/python3.9/site-packages/matplotlib/cm.py", line 405, in _scale_norm
    raise ValueError(
ValueError: Passing a Normalize instance simultaneously with vmin/vmax is not supported. Please pass vmin/vmax directly to the norm when creating it.
Using:
python=3.9.10
matplotlib==3.6.0
numpy==1.23.1
pandas==1.5.0
scipy==1.9.1
setuptools==60.9.3
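For reference, the matplotlib message above can be reproduced and avoided outside of hclust2 with a small sketch like the one below. I have not checked how line 1028 of hclust2.py builds its norm, so the `data` values and the plain `Normalize` are stand-ins; the only point is that the limits go on the norm object rather than into `imshow`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

data = [[0.01, 0.5], [2.0, 10.0]]  # made-up abundance values

fig, ax = plt.subplots()
# Rejected by matplotlib 3.6.0 with the ValueError shown above:
#   ax.imshow(data, norm=Normalize(), vmin=0.01, vmax=10.0)
# Accepted: vmin/vmax are given to the norm when it is created.
norm = Normalize(vmin=0.01, vmax=10.0)
im = ax.imshow(data, norm=norm, cmap="viridis")
plt.close(fig)
```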
Hi,
This worked for me:
https://forum.biobakery.org/t/hclust2-py-error-distance-matrix-finite-values/1732
Just modifying the script in the __init__ function at line 370 should do it.
Regarding your specific error, I think that if you discard the --minv parameter it should work.
Cheers, J
I have the same problem. I am using it on a cluster with a Linux-like system. Do you know how to solve it?
Hello,
This error is related to #1. Once that issue was solved and @fbeghini closed it, I reinstalled hclust2 in a conda environment, as follows:
Using the same merged abundance file mentioned in #1 (created using metaphlan3), I ran the following command:
$ hclust2.py --in merged_abundance_table.txt -l --out heatmap.png
And got the following error:
This being a different error than before, I assume that hclust2 on the bioconda channel had been updated to fix issue #1. In case I was wrong, I followed the advice @fbeghini posted in #1 to manually remove the first line (containing the string #mpa_v30_CHOCOPhlAn_201901) and the column NCBI_tax_id, but got the same error. Looking over the matrix in the merged abundance file, it is not immediately clear why the matrix would contain non-finite values.