Alamar-Biosciences / NULISAseqR

NULISAseq R package
GNU General Public License v3.0

Not file in templates and Error in Rmarkdown knit #148

Open NikoLichi opened 5 months ago

NikoLichi commented 5 months ago

Hi there,

I am trying to use this package for a new data set, but the bundled data appear to be incomplete and I ran into some issues:

1. The demo file, as instructed, is not found:

    data <- loadNULISAseq('/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/NULISAseqR/rmarkdown/templates/nulisaseq/skeleton/detectability_P1_Tr03.xml')

2. Generating the report also shows an error:

    rmarkdown::render("/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/NULISAseqR/rmarkdown/templates/nulisaseq/skeleton/skeleton.Rmd", params=list(dataDir="/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/NULISAseqR/rmarkdown/templates/nulisaseq/skeleton", xmlFiles=c("detectability_P1_Tr03.xml")))
    [Report](/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/NULISAseqR/rmarkdown/templates/nulisaseq/skeleton/skeleton.html)

The error:

Error in yaml::yaml.load(..., eval.expr = TRUE) : 
  Scanner error: while scanning a simple key at line 71, column 1 could not find expected ':' at line 72, column 3
Calls: <Anonymous> ... parse_yaml_front_matter -> yaml_load -> <Anonymous>
Execution halted

All the best, Nicolas

jcbeer commented 5 months ago

Hi Nicolas,

Maybe you need to change the file path to match where detectability_P1_Tr03.xml is stored on your computer. The same issue may be happening with the render command. You need to find where the package files are.

We do have a tutorial script that we will be adding to the package soon, probably next week.

Joanne

NikoLichi commented 5 months ago

Hi Joanne,

Thanks for your prompt reply. I didn't mention it, but I did set the path to where the files are located, both on my computer and on the server.

Inside the skeleton directory, there are only two files, covar.txt and skeleton.Rmd.

I searched all the directories inside the NULISAseqR R directory, and there is nothing with those file names.

Should I try something else?

Best, Nicolas
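For readers following along, a minimal sketch of how the installed package tree could be searched from R instead of hard-coding a library path. It uses only base R (`system.file()` and `list.files()`); the helper name `find_package_files` is hypothetical and not part of NULISAseqR.

```r
# Hypothetical helper (not part of NULISAseqR): list files shipped with an
# installed package whose names match a regular expression.
find_package_files <- function(pkg, pattern = NULL) {
  pkg_dir <- system.file(package = pkg)  # "" when the package is not installed
  if (!nzchar(pkg_dir)) {
    return(character(0))
  }
  list.files(pkg_dir, pattern = pattern, recursive = TRUE, full.names = TRUE)
}

# Look for XML demo files and for the report template in the installed package.
find_package_files("NULISAseqR", pattern = "\\.xml$")
find_package_files("NULISAseqR", pattern = "skeleton\\.Rmd$")
```

An empty result for the first call would be consistent with what is described above: the template is installed, but the demo XML file is not.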

jcbeer commented 5 months ago

Hi Nicolas,

I see. Maybe the files were removed from the repository. I will check and get back to you.

Is the NULISAseq data you have in xml or csv format? Right now the skeleton only works for xml files, so it might not be that useful for your data.

Joanne

NikoLichi commented 5 months ago

Thanks for this, Joanne.

I hope you can find the files soon.

The data I received is in CSV format. However, I want to see which steps are performed in the Rmarkdown with the test file detectability_P1_Tr03.xml, so I can reproduce them and extract more information from the data.

Since the Rmarkdown is the only piece of help on this package, having it should be useful!

jcbeer commented 5 months ago

Hi again Nicolas,

You can access the Rmd template at inst/rmarkdown/templates/nulisaseq/skeleton/skeleton.Rmd in the repository. Some of the chunks will not be relevant, as you already have normalized NPQ values. But some of it (sample boxplot, clustering, heatmaps) might be useful if you are able to transform your data into the right format.

We will be adding another example script soon (in the next few days), which starts with a long-format csv NPQ file. I will let you know when that is uploaded.

Joanne


NikoLichi commented 5 months ago

Hi Joanne, Thanks for the reply. Great news about the CSV!

As mentioned above, and the reason I opened this issue: I have access to the Rmarkdown document, but it requires another file with the actual data, detectability_P1_Tr03.xml, which is not inside the R package.

Could you please upload this file?

If you upload it, I can use the skeleton.Rmd file to see which file format the package needs.

Thanks again and all the best, Nicolas

jcbeer commented 5 months ago

I see, ok sorry I misunderstood. Let me look into adding those as well and get back to you. Joanne


jcbeer commented 5 months ago

Attachments: detectability_P1_Tr03.txt, detectability_P2_Tr03.txt

Hi Nicolas, I won't have a chance to upload them to the repository itself until later, so please see the attached files. If you change the extension from txt to xml, they should hopefully work to generate the report. Let me know if you have any issues. -- Joanne
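The extension change can also be done from R. The following is a small sketch using base R and the `tools` package that ships with R; the helper name `with_extension` is hypothetical, and it copies rather than renames so that the downloaded attachment is left untouched.

```r
# Hypothetical helper: copy a file to the same base name with a new extension.
with_extension <- function(path, new_ext) {
  target <- paste0(tools::file_path_sans_ext(path), ".", new_ext)
  ok <- file.copy(path, target, overwrite = FALSE)
  if (!ok) stop("could not copy ", path, " to ", target)
  target
}

# Assumes the attachments were saved in the current working directory;
# files that are not present are skipped.
downloads <- c("detectability_P1_Tr03.txt", "detectability_P2_Tr03.txt")
present <- downloads[file.exists(downloads)]
xml_files <- vapply(present, with_extension, character(1), new_ext = "xml")
```

The same helper would apply to the later attachments, for example `with_extension("data_analysis_demo.txt", "R")`.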

jcbeer commented 5 months ago

Hi Nicolas, I am attaching here a csv file and R analysis script. You will again have to change the extensions, to "Alamar_NULISAseq_Detectability_NPQ.csv" and "data_analysis_demo.R". We will be updating the package itself with these files soon. Please let me know if you have any issues or further questions, and feedback regarding the package is always appreciated. Thanks

R script: data_analysis_demo.txt

CSV data file: Alamar_NULISAseq_Detectability_NPQ.txt

NikoLichi commented 5 months ago

Hi Joanne,

Thanks a lot for these scripts and the test data sets! I will give you my feedback as soon as possible, by Tuesday at the latest.

Thanks again! All the best, Nicolas

NikoLichi commented 5 months ago


Hi Joanne, I changed the file extensions of the two detectability files as you told me, and R still reports some problems.

  1. When I just load the data as described in the "Demo: Loading Data" section of the front page, loadNULISAseq complains that it needs the SC, IPC and IC arguments. I added them to the call as SC = NULL, IPC = NULL, IC = "mCherry", and the data loaded.

  2. The Rmarkdown was different. It shows the same error I reported in the first message:


Error in yaml::yaml.load(..., eval.expr = TRUE) : 
  Scanner error: while scanning a simple key at line 71, column 1 could not find expected ':' at line 72, column 3

Is there any way to fix the Rmarkdown file, and can you confirm whether loading the XML file this way is correct?
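One way to narrow down a YAML scanner error like this is to print the front matter of the template with line numbers and look at the lines around the reported position. The sketch below uses base R only; `front_matter_lines` is a hypothetical helper, and whether the parser's line 71 counts from the first line after the opening `---` is an assumption, so a small window of lines is printed rather than a single one.

```r
# Hypothetical helper: return the YAML front matter of an Rmd file as numbered
# lines. Assumes the front matter is delimited by "---" lines and starts on
# line 1 of the file (a closing "..." delimiter is not handled).
front_matter_lines <- function(rmd_path) {
  lines <- readLines(rmd_path, warn = FALSE)
  delims <- grep("^---\\s*$", lines)
  if (length(delims) < 2 || delims[1] != 1) {
    stop("no YAML front matter found in ", rmd_path)
  }
  body <- lines[seq_len(delims[2] - 2) + 1]  # lines between the two delimiters
  data.frame(line = seq_along(body), text = body, stringsAsFactors = FALSE)
}

# Inspect the installed template around the position named in the error.
rmd <- system.file("rmarkdown", "templates", "nulisaseq", "skeleton",
                   "skeleton.Rmd", package = "NULISAseqR")
if (nzchar(rmd)) {
  fm <- front_matter_lines(rmd)
  print(fm[fm$line %in% 69:73, ])
}
```

A key without a following `:` or a broken indentation near those lines would match the "could not find expected ':'" message, but the thread does not confirm the actual cause.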

NikoLichi commented 5 months ago

The CSV file (Alamar_NULISAseq_Detectability_NPQ.csv) and the R analysis script (data_analysis_demo.R) worked very well! Thanks!!

I only have one question and one error, which I solved, although I don't know if my fix is a good one.

The error: When running the lmNULISAseq test, the script uses an object plasma_sample_list that should have been created earlier, but it was not. I replaced it with rownames(sample_annotation). Does this make sense?

The question: Regarding the method for differential expression, other scientists recommended that I use limma/voom. Does the linear model used in lmNULISAseq give better results, or which method would be preferable?

Thanks again, Nicolas

jcbeer commented 5 months ago

Glad it worked!

This script was modified a bit from an earlier version that used a different dataset. I imagine plasma_sample_list was subsetting the relevant plasma sample columns for the comparison that was done. For the dataset you have now, you just need to remove the control samples (SampleType == "pooled_plasma") and keep SampleType == "plasma", which is already done at line 47. So I think you can delete all references to plasma_sample_list. Replacing it with rownames(sample_annotation) is fine but not necessary, as the names already match (matching the order of the sample metadata and the protein expression data is taken care of within the function).
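As a rough illustration of that subsetting step, here is a sketch on made-up data. Only the SampleType column and its two values come from this thread; the sample names, the `npq` matrix and its layout (targets in rows, samples in columns) are assumptions for the example.

```r
# Hypothetical sample metadata, with sample names as row names.
sample_annotation <- data.frame(
  SampleType = c("plasma", "plasma", "pooled_plasma", "plasma"),
  row.names = c("S1", "S2", "SC1", "S3")
)

# Hypothetical NPQ matrix: 2 targets (rows) by 4 samples (columns).
npq <- matrix(
  c(10.1, 9.8, 10.0, 10.4,
    7.2, 7.9, 7.5, 7.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("targetA", "targetB"), rownames(sample_annotation))
)

# Keep only the plasma samples and drop the pooled-plasma controls.
plasma_annotation <- sample_annotation[sample_annotation$SampleType == "plasma", , drop = FALSE]
plasma_samples <- rownames(plasma_annotation)
plasma_npq <- npq[, plasma_samples, drop = FALSE]
```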

I think it is probably fine to use limma/voom, and we might add that functionality in the future. We haven't yet done a comprehensive comparison of lm() versus limma/voom for NULISAseq NPQ data, so it is hard to say which approach might be better. I think the lm() approach is perfectly valid in most cases, but it's plausible that limma/voom could improve the statistical power and stability of results for small sample sizes or low-detectability targets by borrowing information across targets. In developing this package, we preferred to start with a simple linear model approach and avoid too much reliance on external packages. But please do let me know if you decide to try limma/voom and have any insights on that.
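To make the "simple linear model" idea concrete, here is a sketch of fitting one lm() per target on simulated data, using base R only. It is not the lmNULISAseq implementation; the data, the group labels and the helper `fit_one_target` are all hypothetical.

```r
set.seed(1)

# Simulated NPQ-like matrix: 5 targets (rows) by 20 samples (columns).
n_targets <- 5
n_samples <- 20
group <- factor(rep(c("control", "case"), each = n_samples / 2),
                levels = c("control", "case"))
npq <- matrix(rnorm(n_targets * n_samples, mean = 10),
              nrow = n_targets,
              dimnames = list(paste0("target", seq_len(n_targets)),
                              paste0("S", seq_len(n_samples))))

# Fit an ordinary linear model for one target and return the group effect
# (case vs. control) with its unadjusted p-value.
fit_one_target <- function(y, group) {
  fit <- lm(y ~ group)
  coefs <- summary(fit)$coefficients
  c(estimate = coefs["groupcase", "Estimate"],
    p_value = coefs["groupcase", "Pr(>|t|)"])
}

# One row per target, then Benjamini-Hochberg adjustment across targets.
results <- as.data.frame(t(apply(npq, 1, fit_one_target, group = group)))
results$p_adjusted <- p.adjust(results$p_value, method = "BH")
```

Each target is fitted independently here, which is the contrast with limma: limma additionally moderates the per-target variance estimates using information from all targets.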

NikoLichi commented 5 months ago

Thanks again for the comments, Joanne!

Yes, I see the redundancy in the code, but it's helpful when subsetting samples.

However, I have a comment on the PCA process. While scaling and centering (Z-score) are preferred for the heatmaps, the data should not be scaled for the PCA (centering is fine). First, as far as I understand, the NULISAseq data are already log2-transformed. Second, scaling is only desirable when the variables are on different scales, which is not the case for NULISAseq. Unless you have specific reasons why this extra scaling is needed, I would recommend changing that part of the code.

I'll try limma/voom and compare the lm() results in my data in the following days. I'll keep you updated.

jcbeer commented 5 months ago

Hi Nicolas, Sure, happy to help. Thanks for the comment on PCA. I see your point. I believe the scaling step prevents any subset of high variance targets from dominating the PCs, and rather weights all the targets equally. However I do see your point that they are all on NPQ scale and maybe we wouldn't want to equalize the weight of the targets for PCA. I will consider it further. Great, let me know if you find anything interesting! -- Joanne
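The two positions can be compared directly with `prcomp()` from base R. The sketch below uses simulated data with one deliberately high-variance target; the matrix, its dimensions and the target names are assumptions for illustration, with samples in rows as `prcomp()` expects.

```r
set.seed(1)

# Simulated data: 30 samples (rows) by 8 targets (columns), all on one scale.
npq <- matrix(rnorm(30 * 8), nrow = 30, ncol = 8,
              dimnames = list(paste0("S", 1:30), paste0("target", 1:8)))
npq[, 1] <- npq[, 1] * 5  # make target1 much more variable than the others

# Centering only: high-variance targets contribute more to the leading PCs.
pca_centered <- prcomp(npq, center = TRUE, scale. = FALSE)

# Centering and scaling: every target is given unit variance, so all targets
# are weighted equally.
pca_scaled <- prcomp(npq, center = TRUE, scale. = TRUE)

# Compare how strongly each target loads on the first component.
round(abs(pca_centered$rotation[, "PC1"]), 2)
round(abs(pca_scaled$rotation[, "PC1"]), 2)
```

Whether the dominance of high-variance targets in the unscaled PCA is a feature or a problem is the judgment call discussed above; the code only shows how to switch between the two.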
