biocore / DEICODE

Robust Aitchison PCA from sparse count data

arrows appear to be biased? #56

Closed scher-lab closed 4 years ago

scher-lab commented 4 years ago

Hello, I'm trying to run DEICODE on my dataset. Below is my plot:

image

Are all of the arrows expected to be located on only one half of PC3?

Below is my command for reference. Thanks!

qiime deicode rpca \
      --i-table /Users/Lyusik/Desktop/skin.pso.psa/outputs/03_filtered_tables_f/samples-no-controls-silva-primers-99-filtered-table.qza \
      --p-min-feature-count 10 \
      --p-min-sample-count 500 \
      --o-biplot /Users/Lyusik/Desktop/skin.pso.psa/jobs/04_deicode_f/samples-no-controls-silva-primers-99-filtered-ordination.qza \
      --o-distance-matrix /Users/Lyusik/Desktop/skin.pso.psa/jobs/04_deicode_f/samples-no-controls-silva-primers-99-filtered-distance.qza \
      --verbose

# create the biplot
qiime emperor biplot \
      --i-biplot /Users/Lyusik/Desktop/skin.pso.psa/jobs/04_deicode_f/samples-no-controls-silva-primers-99-filtered-ordination.qza \
      --m-sample-metadata-file /Users/Lyusik/Desktop/skin.pso.psa/inputs/01_map/map_skin.pso.psa_no.controls.txt \
      --m-feature-metadata-file /Users/Lyusik/Desktop/skin.pso.psa/outputs/04_taxonomy_f/taxonomy-silva-primers-99.qza \
      --o-visualization /Users/Lyusik/Desktop/skin.pso.psa/jobs/04_deicode_f/samples-no-controls-silva-primers-99-filtered-biplot.qzv \
      --p-ignore-missing-samples \
      --p-number-of-features 50 \
      --verbose

cameronmartino commented 4 years ago

Hi @scher-lab,

This may be related to a bug that was just patched in PR #54. Could you please try re-installing with the dev. version by doing:

pip install git+https://github.com/biocore/DEICODE

within your QIIME2 environment. Then rerun the commands you listed above and that may fix the problem.

jaybake5 commented 4 years ago

Screen Shot 2020-03-03 at 2 27 43 PM

The issue described above sounds like the one I noticed a few months ago, when I asked about using DEICODE where the features are enzymes or functions rather than taxa. I tried again with the dev. version, following the installation steps above, and got the same result (pictured above). Any thoughts or advice about what this means would be greatly appreciated. The OTU table and metadata file I used to generate the biplot pictured above are attached.

metadata3.txt

otu_table_filtered4.qza.zip

cameronmartino commented 4 years ago

Hi @jaybake5,

I tried running the data after increasing --p-n-components to 8, adding a frequency filter (--p-min-feature-frequency) of 10, and decreasing --p-min-feature-count and --p-min-sample-count to zero. This solved the problem:

Screen Shot 2020-03-04 at 9 22 46 AM

I think the important change here is the increased rank set by --p-n-components. The default parameters are set for 16S or shallow shotgun abundance data, and I think they just need slight tweaking for this data type. You could also try the new auto_rpca command, which estimates the rank from the data. Let me know if you have any questions. Thanks for reporting!
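For anyone wanting to reproduce this, the parameter changes described above would look roughly like the sketch below. The file names are placeholders that do not come from this thread, and the script only prints the command unless RUN=1 is set, so treat it as a starting point rather than the exact command that was run.

```shell
#!/usr/bin/env bash
# Sketch only: the file names below are hypothetical placeholders.
set -euo pipefail

TABLE="table.qza"           # hypothetical input feature table
BIPLOT="ordination.qza"     # hypothetical output biplot
DISTANCE="distance.qza"     # hypothetical output distance matrix

# Parameter values as reported in the comment above:
# rank raised to 8, feature-frequency filter of 10, count filters set to 0.
cmd=(qiime deicode rpca
     --i-table "$TABLE"
     --p-n-components 8
     --p-min-feature-frequency 10
     --p-min-feature-count 0
     --p-min-sample-count 0
     --o-biplot "$BIPLOT"
     --o-distance-matrix "$DISTANCE")

# Print the command by default; set RUN=1 to execute it
# inside an activated QIIME 2 environment with DEICODE installed.
if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"
else
    printf '%s ' "${cmd[@]}"
    printf '\n'
fi
```

The resulting ordination can then be passed to qiime emperor biplot as in the original post to check whether the arrows are still confined to one side of an axis.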

cameronmartino commented 4 years ago

Hi all, I am going to close this for now. Feel free to re-open if you have more questions. Thanks!