cafferychen777 / MicrobiomeStat

Track, Analyze, Visualize: Unravel Your Microbiome's Temporal Pattern with MicrobiomeStat
https://www.microbiomestat.wiki/

Error "contrasts can be applied only to factors with 2 or more levels" while using generate_alpha_change_test_pair() #29

Open BiggusDickus666 opened 7 months ago

BiggusDickus666 commented 7 months ago

Hello, First of all, thank you for your work on the MicrobiomeStat package. I have successfully created the MicrobiomeStat data object by importing several .qza files from QIIME2. While using generate_alpha_change_test_pair(), the following error occurs:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

As far as I know, this is caused by a column containing only one value (hence generate_alpha_change_test_pair() cannot fit the data to a linear regression model).

I have tried to troubleshoot the problem by running the following to check for columns in my data object containing only one value, but it has not worked so far:

values_count <- sapply(lapply(data.obj, unique), length)
values_count

This is the workflow I have used (I am afraid I cannot attach any .qza files for you to reproduce the analysis, but if necessary I will be happy to provide them via email):

Define paths to your QIIME2 files

otuqza_file <- "/path/merged-table.qza"
taxaqza_file <- "/path/taxonomy.qza"
sample_file <- "/path/metadata2.csv"
treeqza_file <- "/path/rooted-tree.qza"

Import QIIME2 data into a MicrobiomeStat data object

data.obj <- mStat_import_qiime2_as_data_obj(
  otu_qza = otuqza_file,
  taxa_qza = taxaqza_file,
  sam_tab = sample_file,
  tree_qza = treeqza_file
)

Create alpha box plot

AlphaChangeTest <- generate_alpha_change_test_pair(
  data.obj,
  alpha.obj = NULL,
  alpha.name = "shannon",
  depth = NULL,
  time.var = "time",
  subject.var = "subject",
  group.var = "group",
  adj.vars = "Plastic.Type",
  change.base = "1",
  alpha.change.func = "log fold change"
)

Again, thank you very much for your work

cafferychen777 commented 7 months ago

Dear Victor,

Thank you for reaching out and providing a detailed account of the issue you're encountering with the MicrobiomeStat package. I appreciate your efforts in troubleshooting and the workflow information you've shared.

The error message you're seeing, "contrasts can be applied only to factors with 2 or more levels," indeed suggests that one of the variables in your data set does not vary (i.e., it has only one level). This situation prevents the statistical model from being fitted properly, as it requires variability in each factor to compute contrasts.
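The same base R error can be reproduced outside MicrobiomeStat, which may help confirm what it means. This is a minimal sketch on an invented data frame; the names `toy`, `y`, and `grp` are illustrative only and are not part of the package.

```r
# Toy data: `grp` is a factor with a single level, so no contrast can be built.
toy <- data.frame(
  y   = c(1.2, 0.8, 1.5, 1.1),
  grp = factor(c("A", "A", "A", "A"))
)

# Fit a linear model and capture the error message instead of stopping.
msg <- tryCatch(
  {
    lm(y ~ grp, data = toy)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Any factor that ends up with one level in the data actually passed to the model, including after rows have been filtered or paired, would trigger the same message.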

Your approach to identifying the column with a single unique value is a good start. However, since the issue persists, there might be nuances in the data or the specific function parameters causing the problem.
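One possible refinement, offered as a sketch rather than a definitive diagnostic: `data.obj` is a list, and the sample metadata sits in `data.obj$meta.dat` (as used later in this thread), so a per-column check is probably better run on that data frame than on the whole object. The data frame `meta` below is an invented stand-in for `data.obj$meta.dat`.

```r
# Toy stand-in for data.obj$meta.dat; values are invented for illustration.
meta <- data.frame(
  subject      = c("S1", "S1", "S2", "S2"),
  time         = c("1", "2", "1", "2"),
  group        = c("1", "1", "2", "2"),
  Plastic.Type = c("LDPE", "LDPE", "LDPE", "LDPE"),
  row.names    = paste0("sample-", 1:4)
)

# Number of distinct non-missing values in each metadata column.
n_levels <- sapply(meta, function(x) length(unique(x[!is.na(x)])))
print(n_levels)

# Columns with fewer than two distinct values cannot be used as model terms.
constant_cols <- names(n_levels)[n_levels < 2]
print(constant_cols)
```

With real data, replacing `meta` with `data.obj$meta.dat` should give the equivalent check. A column can also pass this test on the full table and still collapse to one level after samples are subset.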

To assist you more effectively, it would be helpful to examine your .qza files. Please send them to my email at cafferychen7850@gmail.com. With access to your data, I'll be able to replicate your analysis, pinpoint the exact issue, and provide a more targeted solution.

Thank you once again for your contribution to improving the MicrobiomeStat package. I'm looking forward to your email and am committed to resolving this issue promptly.

Best regards, Chen YANG

cafferychen777 commented 7 months ago

Dear @BiggusDickus666,

Upon further review of the error you encountered with the generate_alpha_change_test_pair() function in the MicrobiomeStat package, I have identified the root cause of the problem. It appears that the issue stems from the structure of your metadata, specifically how subjects are identified across different time points.

In your provided metadata, each subject ID is unique to a single time point, which means that the function cannot pair data across two different time points for the same subject. For the generate_alpha_change_test_pair() function to work as intended, each subject should have data entries at two (or more) time points under a consistent subject ID, while sample IDs can remain unique.
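A quick way to check this condition is to count the distinct time points recorded for each subject ID. The sketch below uses an invented data frame `meta` in place of `data.obj$meta.dat`, with column names `subject` and `time` matching the `subject.var` and `time.var` arguments in the call above.

```r
# Toy metadata reproducing the problem: every subject ID occurs at one time point only.
meta <- data.frame(
  subject   = c("S1", "S2", "S3", "S4"),
  time      = c("1", "2", "1", "2"),
  row.names = paste0("sample-", 1:4)
)

# Distinct time points observed for each subject.
n_times <- tapply(meta$time, meta$subject, function(x) length(unique(x)))
print(n_times)

# Subjects that cannot be paired because they appear at fewer than two time points.
unpaired <- names(n_times)[n_times < 2]
print(unpaired)
```

If `unpaired` lists every subject, as it does for this toy table, no within-subject change can be computed and the subject IDs need to be made consistent across time points.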

Here is an example of how your metadata should be structured:

data.obj$meta.dat
          subject time Probiotic Plastic.Type group
sample-1       S1    1        No   No plastic     1
sample-2       S1    2        No   No plastic     1
sample-3       S2    1       Yes         LDPE     2
sample-4       S2    2       Yes         LDPE     2
...

In this corrected format, subject serves as a consistent identifier for each individual across multiple time points, while each row (or sample ID) represents a unique combination of subject, time, and other variables.

To resolve the issue, please adjust your metadata to ensure that each subject ID corresponds to multiple entries if there are data for that subject at different time points. This approach will enable the generate_alpha_change_test_pair() function to correctly identify and analyze the paired data.

I hope this clarification helps. If you need further assistance adjusting your data or have any more questions, please feel free to reach out.

Thank you for your dedication to using the MicrobiomeStat package, and I apologize for any inconvenience this issue may have caused.

Best regards,

Chen YANG
