picrust / picrust2

Code, unit tests, and tutorials for running PICRUSt2
GNU General Public License v3.0

pathway_pipeline.py operation failed on a column #298

Closed kingabo closed 1 year ago

kingabo commented 1 year ago

Hi!

I ran PICRUSt2 2.5.1 using these commands:

place_seqs.py -s seq.fa -o out.tre -p 30 --intermediate intermediate/place_seqs

hsp.py -i 16S -t out.tre -o marker_predicted_and_nsti.tsv.gz -p 30 -n

hsp.py -i EC -t out.tre -o EC_predicted.tsv.gz -p 30

metagenome_pipeline.py -i gr.biom -m marker_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out

pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz -o pathways_out -p 1

After the analysis I got this warning:

/soft/picrust2/picrust2-2.5.1/picrust2/pathway_pipeline.py:642: FutureWarning: The operation <function sum at 0x1489d7756160> failed on a column. If any error is raised, this will raise an exception in a future version of pandas. Drop these columns to avoid this warning.

In the end I got both the unstratified and the stratified output, but I am not sure whether I can use it. Is it valid?

I also want to ask about the stratified table with predicted MetaCyc pathways. The tutorial says that the pathway table is similar to the EC table. However, the table with predicted pathways lacks the last column, "norm_taxon_function_contrib", which gives the proportional contribution for each EC number. Is this because that information is calculated only if I use the --per_sequence_contrib option? If I use this option and get the stratified table with pathways per sequence, is the last column also the proportional contribution for the given pathways? Is it analogous to norm_taxon_function_contrib in the EC table? Could you please clarify this?

Many thanks, Kinga

gavinmdouglas commented 1 year ago

Hi Kinga,

Thanks for reporting this warning (or possible error?). It's hard to say whether there is a problem with the data, although I don't think so, as this is a FutureWarning (i.e., a notice that some functionality will change in a future release). Would you mind providing me with the output of conda list in the environment you're working with? I think this warning was introduced in newer versions of pandas.

Yes, you would only get the norm_taxon_function_contrib column with the pathway output if you use the --per_sequence_contrib option. And yes, the interpretation would be the same as for the EC contributional output when this option is used. When the option is not used, pathway abundances correspond to the "community-wide" pathways, which essentially assumes that all genes / reactions can interact freely, regardless of which taxon encodes them. This is a very common assumption in microbiome data analysis, although I think it is seriously flawed.
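
To make the idea of a proportional contribution column concrete, here is a minimal sketch on toy data. It assumes that the column expresses each taxon's share of a function's total abundance within a sample. The column name taxon_function_abun, the sample, taxon and pathway labels, and all numbers are illustrative assumptions and are not taken from real PICRUSt2 output.

```python
import pandas as pd

# Toy stratified table (hypothetical values): one row per sample/function/taxon.
contrib = pd.DataFrame({
    "sample": ["S1", "S1", "S1", "S2"],
    "function": ["PWY-A", "PWY-A", "PWY-B", "PWY-A"],
    "taxon": ["ASV1", "ASV2", "ASV1", "ASV2"],
    "taxon_function_abun": [30.0, 10.0, 5.0, 8.0],
})

# Total abundance of each function within each sample, repeated on every row.
totals = (
    contrib.groupby(["sample", "function"])["taxon_function_abun"]
    .transform("sum")
)

# Assumed definition: the taxon's share of that per-sample, per-function total.
contrib["norm_taxon_function_contrib"] = contrib["taxon_function_abun"] / totals

print(contrib)
```

Under this assumed definition, the values for a given sample and function sum to 1 across taxa, which is what makes the column readable as a proportion.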

Cheers,

Gavin

kingabo commented 1 year ago

Hi Gavin,

Thank you so much for the explanation!

I'm sending the output of conda list:

# packages in environment at /soft/miniconda/4/envs/picrust2_5_1:
#
# Name  Version  Build  Channel
_libgcc_mutex  0.1  conda_forge  conda-forge
_openmp_mutex  4.5  2_gnu  conda-forge
_r-mutex  1.0.1  anacondar_1  conda-forge
alsa-lib  1.2.8  h166bdaf_0  conda-forge
attrs  22.2.0  pyh71513ae_0  conda-forge
binutils_impl_linux-64  2.40  hf600244_0  conda-forge
biom-format  2.1.14  py38h1de0b5d_2  conda-forge
brotlipy  0.7.0  py38h0a891b7_1005  conda-forge
bwidget  1.9.14  ha770c72_1  conda-forge
bzip2  1.0.8  h7f98852_4  conda-forge
c-ares  1.18.1  h7f98852_0  conda-forge
ca-certificates  2022.12.7  ha878542_0  conda-forge
cached-property  1.5.2  hd8ed1ab_1  conda-forge
cached_property  1.5.2  pyha770c72_1  conda-forge
cairo  1.16.0  ha61ee94_1014  conda-forge
certifi  2022.12.7  pyhd8ed1ab_0  conda-forge
cffi  1.15.1  py38h4a40e3a_3  conda-forge
charset-normalizer  2.1.1  pyhd8ed1ab_0  conda-forge
click  8.1.3  unix_pyhd8ed1ab_2  conda-forge
colorama  0.4.6  pyhd8ed1ab_0  conda-forge
coverage  7.2.2  py38h1de0b5d_0  conda-forge
cryptography  40.0.1  py38h3d167d9_0  conda-forge
curl  7.86.0  h2283fc2_1  conda-forge
cython  0.29.33  py38h8dc9893_0  conda-forge
dendropy  4.5.2  pyh3252c3a_0  bioconda
epa-ng  0.3.8  hd03093a_2  bioconda
exceptiongroup  1.1.1  pyhd8ed1ab_0  conda-forge
expat  2.5.0  h27087fc_0  conda-forge
font-ttf-dejavu-sans-mono  2.37  hab24e00_0  conda-forge
font-ttf-inconsolata  3.000  h77eed37_0  conda-forge
font-ttf-source-code-pro  2.038  h77eed37_0  conda-forge
font-ttf-ubuntu  0.83  hab24e00_0  conda-forge
fontconfig  2.14.2  h14ed4e7_0  conda-forge
fonts-conda-ecosystem  1  0  conda-forge
fonts-conda-forge  1  0  conda-forge
freetype  2.12.1  hca18f0e_1  conda-forge
fribidi  1.0.10  h36c2ea0_0  conda-forge
gappa  0.8.0  hd03093a_1  bioconda
gcc_impl_linux-64  12.2.0  hcc96c02_19  conda-forge
gettext  0.21.1  h27087fc_0  conda-forge
gfortran_impl_linux-64  12.2.0  h55be85b_19  conda-forge
giflib  5.2.1  h0b41bf4_3  conda-forge
glpk  4.65  h9202a9a_1004  conda-forge
gmp  6.2.1  h58526e2_0  conda-forge
graphite2  1.3.13  h58526e2_1001  conda-forge
gsl  2.7  he838d99_0  conda-forge
gxx_impl_linux-64  12.2.0  hcc96c02_19  conda-forge
h5py  3.8.0  nompi_py38hd5fa8ee_100  conda-forge
harfbuzz  6.0.0  h8e241bc_0  conda-forge
hdf5  1.12.2  nompi_h4df4325_100  conda-forge
hmmer  3.1b2  3  bioconda
icu  70.1  h27087fc_0  conda-forge
idna  3.4  pyhd8ed1ab_0  conda-forge
iniconfig  2.0.0  pyhd8ed1ab_0  conda-forge
jinja2  3.1.2  pyhd8ed1ab_1  conda-forge
joblib  1.2.0  pyhd8ed1ab_0  conda-forge
jpeg  9e  h0b41bf4_3  conda-forge
kernel-headers_linux-64  2.6.32  he073ed8_15  conda-forge
keyutils  1.6.1  h166bdaf_0  conda-forge
krb5  1.19.3  h08a2579_0  conda-forge
lcms2  2.15  hfd0df8a_0  conda-forge
ld_impl_linux-64  2.40  h41732ed_0  conda-forge
lerc  4.0.0  h27087fc_0  conda-forge
libblas  3.9.0  16_linux64_openblas  conda-forge
libcblas  3.9.0  16_linux64_openblas  conda-forge
libcups  2.3.3  h3e49a29_2  conda-forge
libcurl  7.86.0  h2283fc2_1  conda-forge
libdeflate  1.17  h0b41bf4_0  conda-forge
libedit  3.1.20191231  he28a2e2_2  conda-forge
libev  4.33  h516909a_1  conda-forge
libffi  3.4.2  h7f98852_5  conda-forge
libgcc-devel_linux-64  12.2.0  h3b97bd3_19  conda-forge
libgcc-ng  12.2.0  h65d4601_19  conda-forge
libgfortran-ng  12.2.0  h69a702a_19  conda-forge
libgfortran5  12.2.0  h337968e_19  conda-forge
libglib  2.74.1  h606061b_1  conda-forge
libgomp  12.2.0  h65d4601_19  conda-forge
libiconv  1.17  h166bdaf_0  conda-forge
liblapack  3.9.0  16_linux64_openblas  conda-forge
libnghttp2  1.52.0  h61bc06f_0  conda-forge
libnsl  2.0.0  h7f98852_0  conda-forge
libopenblas  0.3.21  pthreads_h78a6416_3  conda-forge
libpng  1.6.39  h753d276_0  conda-forge
libsanitizer  12.2.0  h46fd767_19  conda-forge
libsqlite  3.40.0  h753d276_0  conda-forge
libssh2  1.10.0  hf14f497_3  conda-forge
libstdcxx-devel_linux-64  12.2.0  h3b97bd3_19  conda-forge
libstdcxx-ng  12.2.0  h46fd767_19  conda-forge
libtiff  4.5.0  h6adf6a1_2  conda-forge
libuuid  2.32.1  h7f98852_1000  conda-forge
libwebp-base  1.3.0  h0b41bf4_0  conda-forge
libxcb  1.13  h7f98852_1004  conda-forge
libxml2  2.10.3  hca2bb57_4  conda-forge
libzlib  1.2.13  h166bdaf_4  conda-forge
make  4.3  hd18ef5c_1  conda-forge
markupsafe  2.1.2  py38h1de0b5d_0  conda-forge
ncurses  6.3  h27087fc_1  conda-forge
nlopt  2.7.1  py38hca016a5_3  conda-forge
numpy  1.24.2  py38h10c12cc_0  conda-forge
openjdk  17.0.3  h58dac75_5  conda-forge
openssl  3.1.0  h0b41bf4_0  conda-forge
packaging  23.0  pyhd8ed1ab_0  conda-forge
pandas  1.5.3  py38hdc8b05c_0  conda-forge
pango  1.50.14  hd33c08f_0  conda-forge
pcre2  10.40  hc3806b6_0  conda-forge
picrust2  2.5.1  dev_0
pip  23.0.1  pyhd8ed1ab_0  conda-forge
pixman  0.40.0  h36c2ea0_0  conda-forge
platformdirs  3.2.0  pyhd8ed1ab_0  conda-forge
pluggy  1.0.0  pyhd8ed1ab_5  conda-forge
pooch  1.7.0  pyha770c72_3  conda-forge
pthread-stubs  0.4  h36c2ea0_1001  conda-forge
pycparser  2.21  pyhd8ed1ab_0  conda-forge
pyopenssl  23.1.0  pyhd8ed1ab_0  conda-forge
pysocks  1.7.1  pyha2e5f31_6  conda-forge
pytest  7.2.2  pyhd8ed1ab_0  conda-forge
pytest-cov  4.0.0  pyhd8ed1ab_0  conda-forge
python  3.8.16  he550d4f_1_cpython  conda-forge
python-dateutil  2.8.2  pyhd8ed1ab_0  conda-forge
python_abi  3.8  3_cp38  conda-forge
pytz  2023.2  pyhd8ed1ab_0  conda-forge
r-base  4.1.3  h2f963a2_5  conda-forge
r-castor  1.7.2  r41h03ef668_0  conda-forge
r-lattice  0.20_45  r41h06615bd_1  conda-forge
r-matrix  1.5_3  r41h5f7b363_0  conda-forge
r-naturalsort  0.1.3  r41hc72bb7e_1004  conda-forge
r-nloptr  2.0.3  r41hb13c81a_1  conda-forge
r-rcpp  1.0.10  r41h38f115c_0  conda-forge
r-rcppeigen  0.3.3.9.3  r41h9f5de39_0  conda-forge
r-rspectra  0.16_1  r41h9f5de39_1  conda-forge
readline  8.2  h8228510_1  conda-forge
requests  2.28.2  pyhd8ed1ab_0  conda-forge
scipy  1.10.1  py38h10c12cc_0  conda-forge
sed  4.8  he412f7d_0  conda-forge
sepp  4.3.10  py38h3252c3a_2  bioconda
setuptools  67.6.0  pyhd8ed1ab_0  conda-forge
six  1.16.0  pyh6c4a22f_0  conda-forge
sysroot_linux-64  2.12  he073ed8_15  conda-forge
tk  8.6.12  h27826a3_0  conda-forge
tktable  2.10  hb7b940f_3  conda-forge
toml  0.10.2  pyhd8ed1ab_0  conda-forge
tomli  2.0.1  pyhd8ed1ab_0  conda-forge
typing-extensions  4.5.0  hd8ed1ab_0  conda-forge
typing_extensions  4.5.0  pyha770c72_0  conda-forge
urllib3  1.26.15  pyhd8ed1ab_0  conda-forge
wheel  0.40.0  pyhd8ed1ab_0  conda-forge
xorg-fixesproto  5.0  h7f98852_1002  conda-forge
xorg-inputproto  2.3.2  h7f98852_1002  conda-forge
xorg-kbproto  1.0.7  h7f98852_1002  conda-forge
xorg-libice  1.0.10  h7f98852_0  conda-forge
xorg-libsm  1.2.3  hd9c2040_1000  conda-forge
xorg-libx11  1.8.4  h0b41bf4_0  conda-forge
xorg-libxau  1.0.9  h7f98852_0  conda-forge
xorg-libxdmcp  1.1.3  h7f98852_0  conda-forge
xorg-libxext  1.3.4  h0b41bf4_2  conda-forge
xorg-libxfixes  5.0.3  h7f98852_1004  conda-forge
xorg-libxi  1.7.10  h7f98852_0  conda-forge
xorg-libxrender  0.9.10  h7f98852_1003  conda-forge
xorg-libxt  1.2.1  h7f98852_2  conda-forge
xorg-libxtst  1.2.3  h7f98852_1002  conda-forge
xorg-recordproto  1.14.2  h7f98852_1002  conda-forge
xorg-renderproto  0.11.1  h7f98852_1002  conda-forge
xorg-xextproto  7.3.0  h0b41bf4_1003  conda-forge
xorg-xproto  7.0.31  h7f98852_1007  conda-forge
xz  5.2.6  h166bdaf_0  conda-forge
zlib  1.2.13  h166bdaf_4  conda-forge
zstd  1.5.2  h3eb15da_6  conda-forge

Cheers,

Kinga

gavinmdouglas commented 1 year ago

Hi there,

This warning should now be fixed in the current development version. Fortunately, it did not appear to affect the output anyway.
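
For readers on an older release who still see a pandas warning of this kind, one common way to avoid it is to restrict an aggregation to numeric columns before summing. The sketch below illustrates that general pattern on toy data. It assumes the warning is triggered by a column that cannot be summed, and it is not a copy of the change made in PICRUSt2; the table, its column names and its values are hypothetical.

```python
import pandas as pd

# Toy table mixing identifier columns with numeric sample columns (hypothetical data).
table = pd.DataFrame({
    "pathway": ["PWY-A", "PWY-A", "PWY-B"],
    "taxon": ["ASV1", "ASV2", "ASV1"],
    "sample1": [1.0, 2.0, 3.0],
    "sample2": [0.5, 0.5, 4.0],
})

# Select only the numeric columns so the sum is never attempted on text columns.
numeric_cols = table.select_dtypes(include="number").columns.tolist()

# Sum abundances per pathway across taxa, using the numeric columns only.
summed = table.groupby("pathway")[numeric_cols].sum()

print(summed)
```

Selecting the columns explicitly keeps the result the same across pandas versions, whether non-summable columns are silently dropped, warned about, or rejected with an exception.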

Cheers,

Gavin

kingabo commented 1 year ago

Hi Gavin,

That's great news! Many thanks for your support!

Cheers, Kinga