Open pkirti33 opened 4 months ago
Hi @pkirti33,
Thank you for bringing this issue to my attention. Indeed, it was a peculiar error where the feature.tab was not recognized as a matrix after applying the mStat_normalize_data() function with the Rarefy-TSS method. Although I couldn't pinpoint the exact cause of this anomaly, I've implemented a fix by adding a forced conversion to matrix at the end of the normalization process.
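For readers unfamiliar with the matrix check involved, here is a minimal base-R sketch of what a forced conversion to matrix does. The toy table and its names (counts_df, taxonA, sample1, and so on) are hypothetical, and the thread does not establish that feature.tab actually became a data.frame; that is only one common way a table stops being a matrix in R.

```r
# Hypothetical toy table; not data from this thread.
counts_df <- data.frame(
  sample1 = c(10, 0, 5),
  sample2 = c(3, 7, 0),
  row.names = c("taxonA", "taxonB", "taxonC")
)

# A data.frame is a list of columns, so it does not count as a matrix.
was_matrix <- is.matrix(counts_df)

# as.matrix() converts it and keeps the row and column names.
counts_mat <- as.matrix(counts_df)
now_matrix <- is.matrix(counts_mat)

print(c(before = was_matrix, after = now_matrix))
```

The conversion only helps if it is applied to the element that the validator actually inspects.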
I've already pushed the update to the GitHub repository. It should be available in a few hours. Please update the MicrobiomeStat package then, and let me know if the problem persists or if there's anything else I can help you with.
Best regards, Chen YANG
Hello, Thank you for your prompt reply and help! I tried re-running my code, but the issue has not resolved itself. My steps are below:
Detach and re-install MicrobiomeStat
detach("package:MicrobiomeStat", unload = TRUE)
devtools::install_github("cafferychen777/MicrobiomeStat")
library(MicrobiomeStat)
Make the microbiomeData object:
MicrobiomeData <- list(feature.tab = otu_table_matrix,
meta.dat = metadata_df,
feature.ann = taxonomy_matrix)
MicrobiomeData <- mStat_normalize_data(data.obj = MicrobiomeData, method = "Rarefy-TSS")
MicrobiomeData$data.obj.norm$feature.tab <- as.matrix(MicrobiomeData$data.obj.norm$feature.tab)
mStat_validate_data(MicrobiomeData)
The error is as follows:
Rule 1 passed: data.obj is a list.
Rule 2 passed: meta.dat has been converted to a data.frame.
Rule 3 passed: The row names of feature.tab match the row names of feature.ann.
Rule 4 passed: The order of rows in meta.dat has been adjusted to match feature.tab.
Error in mStat_validate_data(MicrobiomeData) :
Rule 5 failed: feature.tab should be a matrix.
Hi pkirti33,
Thanks for following up and providing more details. I apologize that the issue is still not resolved. Based on the error message, it seems the root cause is that the feature.tab object is not being recognized as a matrix after the mStat_normalize_data() step, even when converting it explicitly using as.matrix().
One potential workaround is to skip the explicit normalization step. In the current version of MicrobiomeStat, almost all the functions perform "Rarefy-TSS" normalization by default under the hood. So you may be able to get the expected results without needing to call mStat_normalize_data() directly.
Try this simplified workflow and see if it resolves the validation error:
MicrobiomeData <- list(feature.tab = otu_table_matrix,
meta.dat = metadata_df,
feature.ann = taxonomy_matrix)
mStat_validate_data(MicrobiomeData)
If the issue persists, please let me know. I'll do some further testing on my end to identify the underlying problem with mStat_normalize_data() converting the data type. In the meantime, hopefully skipping that step provides a temporary solution.
Best regards, Caffery
Thank you for your help! I'll use your recommended solution for now.
Hi all, I am new in MicrobiomeStat. I am having the same problem as @pkirti33.
"Error in mStat_validate_data(MicrobiomeData_rare) : Rule 5 failed: feature.tab should be a matrix"
Is there any update or some alternative for Rarefy-TSS?
Thank you so much! Carla.
Hi @ctmlab4,
Thanks for reaching out regarding the issue you encountered with the mStat_validate_data() function after using mStat_normalize_data() with the "Rarefy-TSS" method.
As a workaround for now, you have two options:
You can directly run other functions without any additional conversions.
Alternatively, after running the mStat_normalize_data() function, you can convert the feature.tab element of the returned object to a matrix using as.matrix(). Here's an example:
MicrobiomeData_rare <- mStat_normalize_data(data.obj = MicrobiomeData, method = "Rarefy-TSS")
MicrobiomeData_rare$feature.tab <- as.matrix(MicrobiomeData_rare$feature.tab)
mStat_validate_data(MicrobiomeData_rare)
Either of these approaches should resolve the issue and allow the mStat_validate_data() function to pass all the validation rules.
We appreciate your patience and understanding. We are actively working on a more permanent solution to address this issue in a future update of the MicrobiomeStat package.
If you have any further questions or concerns, please don't hesitate to reach out.
Best regards, Caffery
Hi @cafferychen777,
I believe I am having a similar problem as the others above. I turned my phyloseq object to a data.obj:
data.obj <- mStat_convert_phyloseq_to_data_obj(physeq_final_100k)
Then I wanted to use the 'mStat_rarefy_data' command to rarefy to a read depth of 100,000:
rarefied_data <- mStat_rarefy_data(data.obj = data.obj, depth = 100000)
Then made my rarefied_data object a matrix which passed all the rules with 'mStat_validate_data(rarefied_data)':
rarefied_data$feature.tab <- as.matrix(rarefied_data$feature.tab)
mStat_validate_data(rarefied_data)
Then I wanted to use 'mStat_calculate_alpha_diversity':
alpha_rarefied <- mStat_calculate_alpha_diversity(x = rarefied_data, alpha.name = c("shannon", "simpson", "observed_species"))
But I get the following error: "Error in colSums(x) : 'x' must be an array of at least two dimensions"
So then I try:
alpha_rarefied <- mStat_calculate_alpha_diversity(x = rarefied_data$feature.tab, alpha.name = c("shannon", "simpson", "observed_species"))
which looks like it runs properly, but when I run:
mStat_validate_data(alpha_rarefied)
it throws an error:
"Rule 1 passed: data.obj is a list.
Rule 2 passed: meta.dat has been converted to a data.frame.
Rule 3 passed: The row names of feature.tab match the row names of feature.ann.
Rule 4 passed: The order of rows in meta.dat has been adjusted to match feature.tab.
Error in mStat_validate_data(alpha_rarefied) :
Rule 5 failed: feature.tab should be a matrix."
I also see this problem being addressed in #7, however reading that issue did not help me understand my issue.
When I try another normalization method like "TSS":
TSS_data <- mStat_normalize_data(data.obj = data.obj, method = "TSS")
And I try to make it a matrix:
Note: to access "feature.tab" I have to first go through "$data.obj.norm" and then "$feature.tab".
TSS_data$data.obj.norm$feature.tab <- as.matrix(TSS_data$data.obj.norm$feature.tab)
mStat_validate_data(TSS_data)
'mStat_validate_data(TSS_data)' throws an error:
"Rule 1 passed: data.obj is a list.
Rule 2 passed: meta.dat has been converted to a data.frame.
Rule 3 passed: The row names of feature.tab match the row names of feature.ann.
Rule 4 passed: The order of rows in meta.dat has been adjusted to match feature.tab.
Error in mStat_validate_data(TSS_data) :
Rule 5 failed: feature.tab should be a matrix."
How do I tweak my code to be able to use different normalization methods with mStat_calculate_alpha_diversity? Should I use one of the other alpha diversity commands? Thank you for your help.
MicrobiomeStat version 1.2.0 R version 4.3.2
Hi @bark9299 @pkirti33 @ctmlab4,
I think I may have found the cause of the error. After normalizing the data using mStat_normalize_data(), you should use the $data.obj.norm element of the returned object instead of the original data.obj. For example:
norm.data.obj <- mStat_normalize_data(data.obj, "TSS")$data.obj.norm
Then, in subsequent function calls, use norm.data.obj instead of data.obj.
The reason for this is that during the normalization process, a new data.obj.norm (in the form of a list) is generated and stored within the original data.obj. Therefore, you need to replace the usage of the original data.obj with the newly generated data.obj.norm, rather than only using the new feature.tab.
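To make the nesting concrete, here is a small base-R sketch that mimics the structure described above with a toy list. It does not call MicrobiomeStat; the object names (feature_tab, normalized, norm_obj) are hypothetical, and it assumes, as the reports in this thread suggest, that the returned list has no feature.tab element at its top level.

```r
# Toy stand-in for the returned object; the real one may hold more elements.
feature_tab <- matrix(
  c(0.5, 0.5, 0.2, 0.8),
  nrow = 2,
  dimnames = list(c("taxonA", "taxonB"), c("s1", "s2"))
)
normalized <- list(data.obj.norm = list(feature.tab = feature_tab))

# The wrapper has no top-level feature.tab, so `$` returns NULL,
# and NULL is not a matrix.
top_is_null <- is.null(normalized$feature.tab)
top_is_matrix <- is.matrix(normalized$feature.tab)

# Extracting the nested list first gives an object whose feature.tab is a matrix.
norm_obj <- normalized$data.obj.norm
nested_is_matrix <- is.matrix(norm_obj$feature.tab)

print(c(top_is_null = top_is_null,
        top_is_matrix = top_is_matrix,
        nested_is_matrix = nested_is_matrix))
```

Under that assumption, a matrix check on the wrapper fails however often the nested feature.tab is converted, which is consistent with the Rule 5 failures reported earlier.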
So your workflow should look something like this:
data.obj <- mStat_convert_phyloseq_to_data_obj(physeq_final_100k)
norm.data.obj <- mStat_normalize_data(data.obj, "TSS")$data.obj.norm
mStat_validate_data(norm.data.obj)
alpha_diversity <- mStat_calculate_alpha_diversity(x = norm.data.obj$feature.tab, alpha.name = c("shannon", "simpson", "observed_species"))
By using norm.data.obj consistently after the normalization step, the mStat_validate_data() function should pass all validation rules, and the mStat_calculate_alpha_diversity() function should work as expected.
Please give this a try and let me know if it resolves the issues you were encountering. If you have any further questions or need additional assistance, don't hesitate to ask.
Best regards, Caffery
Hi Caffery,
I tried it and I could do it without any problems! Thank you very much for your help!
Kind regards, Carla.
Hi @cafferychen777 ,
That worked for me as well. Thank you for your help and speedy reply!
Best,
E
Describe the Bug
When I use mStat_normalize_data() with the Rarefy-TSS method, the mStat_validate_data() function no longer passes because it doesn't recognize feature.tab as a matrix (Rule 5). When I don't use mStat_normalize_data(), all the tests pass.
Example
The following code fails at step 5 (Rule 5 failed: feature.tab should be a matrix.)
However, the following code passes all validations. Furthermore, when I rarefy the data with mStat_rarefy_data(data.obj = MicrobiomeData) prior to validation, all validations pass.
Environment Information: