biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License

q2-Songbird: plotting regression-biplots issue #54

Open jaybake5 opened 5 years ago

jaybake5 commented 5 years ago

Hi Jamie, Unless I am misunderstanding something, I think there might be an issue with the QIIME 2 Songbird plugin when making the regression biplot. When I use your RedSea tutorial dataset, qiime songbird multinomial appears to run correctly, but when I plug the regression-biplot.qza into qiime emperor biplot I get an error:

(qiime2-2019.1) jobakerlt-osx:march-q2songbird jobaker$ qiime songbird multinomial \
  --i-table redsea.qza \
  --m-metadata-file redsea_metadata.txt \
  --p-formula "Depth+Temperature+Salinity+Oxygen+Fluorescence+Nitrate" \
  --o-differentials redsea_differentials.qza \
  --o-regression-stats redsea_regression-stats.qza \
  --o-regression-biplot redsea_regression-biplot.qza \
  --verbose
2019-05-07 20:40:05.079391: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-07 20:40:05.079660: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 8. Tune using inter_op_parallelism_threads for best performance.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8000/8000 [00:10<00:00, 731.36it/s]
Saved FeatureData[Differential] to: redsea_differentials.qza
Saved SampleData[SongbirdStats] to: redsea_regression-stats.qza
Saved PCoAResults % Properties(['biplot']) to: redsea_regression-biplot.qza
(qiime2-2019.1) jobakerlt-osx:march-q2songbird jobaker$ qiime emperor biplot \
  --i-biplot redsea_regression-biplot.qza \
  --m-sample-metadata-file redsea_metadata.txt \
  --p-number-of-features 7 \
  --o-visualization redsea_emperor-biplot
Plugin error from emperor:

None of the sample identifiers match between the metadata and the coordinates. Verify that you are using metadata and coordinates corresponding to the same dataset.

Debug info has been saved to /var/folders/rr/2hbbkrqs5mn86rnt2s4yywsr0003kj/T/qiime2-q2cli-err-y6jdxgcr.log

I first noticed this error with my own dataset, using the same OTU table and metadata table that had worked for making a biplot with DEICODE (i.e. the biplot didn't give a similar error then). Since your RedSea tutorial data has the same issue, I don't think it's a mismatch between the OTU table and the metadata. If it helps, my qiime info is here:

System versions
Python version: 3.6.5
QIIME 2 release: 2019.1
QIIME 2 version: 2019.1.0
q2cli version: 2019.1.0

Installed plugins
alignment: 2019.1.0
composition: 2019.1.0
cutadapt: 2019.1.0
dada2: 2019.1.0
deblur: 2019.1.0
deicode: 0.1.6
demux: 2019.1.0
diversity: 2019.1.0
emperor: 2019.1.0
feature-classifier: 2019.1.0
feature-table: 2019.1.0
fragment-insertion: 2019.1.0
gneiss: 2019.1.0
longitudinal: 2019.1.0
metadata: 2019.1.0
phylogeny: 2019.1.0
quality-control: 2019.1.0
quality-filter: 2019.1.0
rankratioviz: 0.0.0
sample-classifier: 2019.1.0
songbird: 0.8.2
taxa: 2019.1.0
types: 2019.1.0
vsearch: 2019.1.0

Thanks!

mortonjt commented 5 years ago

I bet it stems from confusion over the sample metadata.

The actual command should be this:

qiime emperor biplot \
  --i-biplot regression-biplot.qza \
  --m-sample-metadata-file data/redsea/feature_metadata.txt \
  --p-ignore-missing-samples \
  --p-number-of-features 7 \
  --o-visualization emperor-biplot

What's actually happening here is that --m-sample-metadata-file doesn't refer to the actual metadata for the samples; it refers to the feature metadata (i.e. taxonomies, genome information, ...).
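One way to see why Emperor reports that no identifiers match is to compare the two sets of IDs directly. The sketch below is only an illustration: `read_metadata_ids` and `id_overlap` are hypothetical helpers, not part of QIIME 2 or Emperor. It assumes the metadata is a tab-separated file with the IDs in the first column, and that the IDs Emperor plots as points have already been collected into a list.

```python
import csv
import io


def read_metadata_ids(handle):
    """Return the first-column IDs from a tab-separated metadata file.

    Assumes the first non-empty row is the header and that rows starting
    with '#q2:' are directive rows rather than data.
    """
    ids = []
    header_seen = False
    for row in csv.reader(handle, delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        if not header_seen:
            header_seen = True  # first non-empty row is the header
            continue
        if row[0].startswith("#q2:"):
            continue  # e.g. a '#q2:types' row
        ids.append(row[0])
    return ids


def id_overlap(metadata_ids, biplot_ids):
    """Summarise which IDs are shared and which appear on one side only."""
    metadata_ids, biplot_ids = set(metadata_ids), set(biplot_ids)
    return {
        "shared": sorted(metadata_ids & biplot_ids),
        "only_in_metadata": sorted(metadata_ids - biplot_ids),
        "only_in_biplot": sorted(biplot_ids - metadata_ids),
    }


# Made-up IDs: sample metadata compared against a biplot whose points are taxa.
sample_metadata = io.StringIO("id\tDepth\nS1\t10\nS2\t20\n")
report = id_overlap(read_metadata_ids(sample_metadata), ["taxonA", "taxonB"])
print(report)
```

An empty `shared` list is the situation the Emperor error describes. Passing the feature metadata instead should make the lists overlap if the points in the biplot really are the features.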

I agree - the parameters are a little confusing. Maybe a quick fix is to introduce additional biplot commands in Emperor to help differentiate between the different types of metadata? I'm open to suggestions. CC @ElDeveloper

ElDeveloper commented 5 years ago

I agree, perhaps the parameter names should be more specific to the "arrows" and "points" and instead be --m-point-metadata-file and --m-arrow-metadata-file. With all the different types of data that can be plotted as a biplot, calling these parameters "sample metadata" and "feature metadata" can be a bit confusing for some use cases.

jaybake5 commented 5 years ago

Thanks! So, I got it to work with your tutorial by using the feature metadata file instead of the sample metadata file, but with my data, I keep getting the same error.

In my dataset, the features are species names. I don't really have any metadata for my species, so I just made a tsv with the species names in the first column and kingdom (bacteria or virus) in the 2nd column (just to have something). When giving the emperor biplot this table, I get the same error. As far as I can tell, they match up exactly, but I still get the error...any ideas?
I've attached my two metadata files and otu table. Thanks so much for your help.

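otu_table.txt https://github.com/biocore/songbird/files/3159258/otu_table.txt
feature_metadata.txt https://github.com/biocore/songbird/files/3159259/feature_metadata.txt
sample_metadata.txt https://github.com/biocore/songbird/files/3159260/sample_metadata.txt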
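When two ID lists look identical by eye but still fail to match, one possible cause is an invisible difference such as trailing whitespace, letter case, or spaces versus underscores in species names. This is a guess at the kind of problem involved, not a diagnosis of the attached files. The sketch below uses a hypothetical helper, `near_matches`, with made-up normalisation rules, to list IDs that agree only after such differences are ignored.

```python
def normalise(identifier):
    """Collapse differences that are easy to miss by eye.

    Illustrative rules only: trim whitespace, ignore case, and treat
    spaces and underscores as the same separator.
    """
    words = identifier.strip().casefold().replace("_", " ").split()
    return "_".join(words)


def near_matches(metadata_ids, biplot_ids):
    """Return (metadata_id, biplot_id) pairs that match only after normalising."""
    exact = set(metadata_ids) & set(biplot_ids)
    by_normalised = {}
    for biplot_id in biplot_ids:
        by_normalised.setdefault(normalise(biplot_id), biplot_id)

    pairs = []
    for metadata_id in metadata_ids:
        if metadata_id in exact:
            continue  # already an exact match, nothing to report
        candidate = by_normalised.get(normalise(metadata_id))
        if candidate is not None and candidate != metadata_id:
            pairs.append((metadata_id, candidate))
    return pairs


# Made-up IDs for illustration; repr() makes stray whitespace visible.
for metadata_id, biplot_id in near_matches(
    ["Streptococcus mutans ", "Veillonella parvula"],
    ["Streptococcus_mutans", "Veillonella parvula"],
):
    print(repr(metadata_id), "vs", repr(biplot_id))
```

Any pair this reports is a candidate for an ID that needs to be made byte-for-byte identical in the metadata file. An empty result means the mismatch has some other cause.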

mortonjt commented 5 years ago

Ok. Can you post your biplot as well so that I can reproduce the error?


jaybake5 commented 5 years ago

You mean the output from qiime songbird multinomial? Let me know if you mean something else.

regression-biplot.qza.zip
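To check which IDs a biplot artifact actually contains, the ordination file can be read straight from the archive, since a .qza is a zip file. The sketch below rests on two assumptions that are not confirmed in this thread: that the artifact stores its data at `<uuid>/data/ordination.txt`, and that the file uses scikit-bio's ordination text format, where a header line such as `Site<TAB>rows<TAB>columns` is followed by that many rows, each beginning with an ID. `ordination_ids` is a hypothetical helper, not a QIIME 2 function.

```python
import zipfile


def ordination_ids(qza_path, section="Site"):
    """Return the row IDs listed under one section of the ordination file.

    Assumption: in scikit-bio's format the 'Site' section holds the rows
    treated as samples (points) and 'Species' holds the features (arrows).
    """
    with zipfile.ZipFile(qza_path) as archive:
        # Assumed layout: <uuid>/data/ordination.txt inside the archive.
        names = [n for n in archive.namelist() if n.endswith("data/ordination.txt")]
        if not names:
            raise ValueError("no data/ordination.txt found in " + str(qza_path))
        text = archive.read(names[0]).decode("utf-8")

    lines = text.splitlines()
    for index, line in enumerate(lines):
        fields = line.split("\t")
        # A section header has exactly three fields: name, rows, columns.
        if len(fields) == 3 and fields[0] == section:
            n_rows = int(fields[1])
            rows = lines[index + 1:index + 1 + n_rows]
            return [row.split("\t")[0] for row in rows]
    return []
```

Calling `ordination_ids("regression-biplot.qza")` after unzipping the attachment would list the IDs plotted as points, which can then be compared with the first column of whichever metadata file is passed to `--m-sample-metadata-file`.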