Closed renzhezhu677 closed 6 years ago
Hi @renzhezhu677 ,
1. Can we compare the intensities of different features in the same sample? Are the intensities of different features comparable? If not, can we solve this problem with Z-score or another intensity normalization method?
In general, the intensities of different features from the same sample are not comparable. I don't think Z-score transformation can solve this problem. I don't know whether there is any normalization method that can do this.
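One way to see the limitation, as a toy sketch with made-up numbers (this is not metaX code): a Z-score is computed per feature across samples, so it only describes where a sample sits relative to the other samples for that feature. It carries no information about how abundant one compound is relative to another.

```r
# Toy intensity matrix: rows are features, columns are samples.
# The values are invented purely for illustration.
x <- rbind(
  featA = c(1e6, 2e6, 3e6),
  featB = c(10, 20, 30)
)

# Z-score each feature (row) across samples.
z <- t(apply(x, 1, function(v) (v - mean(v)) / sd(v)))
print(z)

# The two features differ by five orders of magnitude in raw intensity,
# yet their Z-scores are identical, so the Z-scores say nothing about
# which feature is more abundant within a sample.
identical_rows <- isTRUE(all.equal(z["featA", ], z["featB", ]))
print(identical_rows)
```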
2. Can the features from positive and negative ion modes be combined during data processing (from missing value imputation and normalization to downstream analysis)? If not, is there any method to correct the features from the two ion modes to the same level?
I recommend that you perform missing value imputation and normalization for the positive and negative feature sets separately, because the two feature sets are generated separately and may have very different missing value and intensity distributions. You can combine the results from the two feature sets for pathway analysis and biomarker discovery.
Bo
Thanks for your immediate reply! @wenbostar I have two more questions for you:
p <- metaXpipe(para,plsdaPara=plsdaPara,missValueRatioQC = 0.5, missValueRatioSample = 0.8,cvFilter=0.3, remveOutlier = TRUE,nor.order=1,doQA = TRUE, doROC = TRUE, qcsc = 1, pclean = FALSE, t = 1, scale = "pareto", nor.method="pqn", outTol=1.2)
@renzhezhu677 , you're welcome.
- Does metaX automatically perform scaling and transformation when drawing the PCA plot and heatmap in the QA report?
In metaX, plotPCA is used for PCA analysis and plotHeatMap is used for heatmap analysis. You can use ?plotPCA and ?plotHeatMap to find the usage of the two functions. For PCA analysis, the parameter scale is used for specifying the scaling method. If you want to do a log transformation, you can use the function preProcess in metaX. For heatmap analysis, the parameter log is used to control whether or not to do a log2 transformation. You can also use the function preProcess to do that.
If it does, are the scaling and transformation methods the ones specified by the arguments 't' and 'scale' of the function 'metaXpipe()'? I used metaXpipe() for data processing and some statistical analyses. Here's my code for metaXpipe():
p <- metaXpipe(para,plsdaPara=plsdaPara,missValueRatioQC = 0.5, missValueRatioSample = 0.8,cvFilter=0.3, remveOutlier = TRUE,nor.order=1,doQA = TRUE, doROC = TRUE, qcsc = 1, pclean = FALSE, t = 1, scale = "pareto", nor.method="pqn", outTol=1.2)
For PCA analysis, the two parameters are used. But for heatmap analysis, as you can see in the code of metaXpipe:
fig <- plotHeatMap(pp,valueID="valueNorm",log=TRUE,rmQC=FALSE, scale="row", clustering_distance_rows="euclidean", clustering_distance_cols="euclidean", clustering_method="ward.D2", show_colnames=FALSE)
By default, log2 transformation and scaling (scale="row": row-wise scaling, which is a parameter of the function pheatmap) are performed. The parameters "t" and "scale" are not used for heatmap analysis in metaXpipe. If you don't like the default setting, you can first use preProcess to process the data and then use plotHeatMap on the processed data.
Thank you for your patience! @wenbostar Here's what I got from your reply:
If I'd like to perform heatmap analysis using plotHeatMap, values of the parameter scale are limited to row, column and none, as inherited from pheatmap. What still remains unclear to me is which scaling method (pareto, uv, vector or other methods) row exactly refers to. And here's another question for you:
I assume the confidence ellipses in PCA plots were drawn by the R package eclipse. If so, what default values does metaX use for the arguments of eclipse?
It seems that the 'log' method in all parameters (of any function) refers to log2, rather than log10 or any other base.
Yes. For heatmap analysis, I don't think you will find a difference when you use log10 or another base.
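A quick way to check this, as a minimal sketch in base R on simulated data (not metaX code): changing the log base only multiplies every value by a constant, and row-wise scaling divides that constant out again. The helper row_z below is a hypothetical stand-in for the row scaling step.

```r
set.seed(1)
# Simulated intensities: 4 features (rows) x 5 samples (columns).
x <- matrix(rlnorm(20, meanlog = 10), nrow = 4)

# Hypothetical helper: centre each row on its mean and divide by its sd.
row_z <- function(m) (m - rowMeans(m)) / apply(m, 1, sd)

# log10(x) is log2(x) times a constant, so the row-scaled matrices agree.
same <- all.equal(row_z(log2(x)), row_z(log10(x)))
print(same)
```

This only covers the row-scaled case, which is the default described above for the heatmap drawn by metaXpipe.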
If I'd like to perform heatmap analysis using plotHeatMap, values of the parameter scale are limited to row, column and none, as inherited from pheatmap. What still remains unclear to me is which scaling method (pareto, uv, vector or other methods) row exactly refers to.
It's "auto".
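For readers unsure of the difference, here is a small base R sketch on made-up numbers. It assumes that "auto" means the usual autoscaling (centre each feature, divide by its standard deviation, also called unit-variance scaling) and that Pareto scaling divides by the square root of the standard deviation instead. The helper names are hypothetical and are not metaX or pheatmap functions.

```r
# Made-up data: rows are features, columns are samples.
x <- rbind(f1 = c(2, 4, 6, 8), f2 = c(100, 400, 900, 1600))

# Autoscaling: mean-centre each row, divide by its standard deviation.
auto_scale <- function(m) (m - rowMeans(m)) / apply(m, 1, sd)

# Pareto scaling: mean-centre each row, divide by sqrt of its standard deviation.
pareto_scale <- function(m) (m - rowMeans(m)) / sqrt(apply(m, 1, sd))

a <- auto_scale(x)
p <- pareto_scale(x)

print(apply(a, 1, sd))  # every row ends up with unit standard deviation
print(apply(p, 1, sd))  # rows keep a spread equal to sqrt of the original sd
```

With autoscaling every feature contributes equally to the heatmap colours, whereas Pareto scaling leaves high-variance features somewhat more prominent.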
I assume the confidence ellipses in PCA plots were drawn by the R package eclipse. If so, what default values does metaX use for the arguments of eclipse?
It's not the R package eclipse. You can take a look at the code of plotPCA for the details.
Thank you again! @wenbostar On the issue of combining positive and negative ion modes, I agree with you that features from the two modes cannot be combined during data processing, and that the results can be combined for pathway analysis and biomarker discovery. However, after we scale the positive and negative feature sets separately, can we perform hierarchical clustering on the combined set of the two ion modes, in order to observe a global profile of all features? The main reason I insist on combining the two feature sets is that the combination seems more efficient and avoids the trouble of integrating descriptions of two sets of downstream results.
As I said before, you can try to combine the two datasets for PCA and heatmap analysis, after you do missing value imputation and normalization for the two datasets separately.
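The workflow suggested here could look roughly like the following base R sketch. It assumes missing value imputation and normalization have already been done separately for each ion mode (for example with metaX). The matrices pos and neg are simulated placeholders, and row_z is a hypothetical helper, not a metaX function.

```r
set.seed(42)
samples <- paste0("S", 1:6)

# Placeholder matrices standing in for the separately processed feature sets
# (rows are features, columns are the same samples in the same order).
pos <- matrix(rlnorm(5 * 6, meanlog = 12), nrow = 5,
              dimnames = list(paste0("pos_", 1:5), samples))
neg <- matrix(rlnorm(4 * 6, meanlog = 8), nrow = 4,
              dimnames = list(paste0("neg_", 1:4), samples))

# Hypothetical helper: row-wise autoscaling.
row_z <- function(m) (m - rowMeans(m)) / apply(m, 1, sd)

# Both modes must describe the same samples before they are stacked.
stopifnot(identical(colnames(pos), colnames(neg)))

# Log-transform and scale each mode on its own, then stack the features.
combined <- rbind(row_z(log2(pos)), row_z(log2(neg)))

# Hierarchical clustering of features and of samples on the combined set.
hc_features <- hclust(dist(combined, method = "euclidean"), method = "ward.D2")
hc_samples  <- hclust(dist(t(combined), method = "euclidean"), method = "ward.D2")

# PCA with samples as observations; features are already scaled.
pca <- prcomp(t(combined), center = TRUE, scale. = FALSE)
print(dim(combined))
```

Prefixing the feature names with the ion mode keeps the origin of each row visible in the combined heatmap or loadings.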
Hi again! Another question for you: how can I export the results of intermediate steps from metaX? My case is quite complicated because of a cross-species analysis requirement, so I need to export the ion intensities after missing value filtering and imputation, but before normalization. Unfortunately, I couldn't find a function to do this in the R documentation of metaX. Thanks!
Please find my answer in #6. Thanks.
Hi, I have two questions about the data processing of metabolomic data, as follows:
1. Can we compare the intensities of different features in the same sample? Are the intensities of different features comparable? If not, can we solve this problem with Z-score or another intensity normalization method?
2. Can the features from positive and negative ion modes be combined during data processing (from missing value imputation and normalization to downstream analysis)? If not, is there any method to correct the features from the two ion modes to the same level?
Thanks a lot!