biodavidjm / artMS

Analytical R Tools for Mass Spectrometry
GNU General Public License v3.0

question about user modification #189

Open hemingwang opened 3 years ago

hemingwang commented 3 years ago

Hi, I want to analyze methylation in my data, so I set a user-defined PTM.

In my config file, I wrote:

data:
  enabled: 1
  silac:
    enabled: 0
  filters:
    enabled: 1
    contaminants: 1
    protein_groups: remove
    modifications: PTM:KR:methyl

But I ran into this error:

Error in .artms_filterData(x = x, config = config, verbose = verbose) : The config > data > filters > modification PTM:KR:METHYL is not valid option

hemingwang commented 3 years ago

Hi David, could you please help me?

biodavidjm commented 3 years ago

Hi there,

Remember that you first need to do the protein-to-site conversion of the evidence file (please check the guidelines). Basically, check the column Modified sequence of the evidence.txt file. If you selected methylation when searching for modifications with MaxQuant, how are the peptides annotated? (If you did not enrich and search for methylation, you won't have methylation in the Modified sequence column.) For example, phosphorylation and ubiquitination would look something like this:

`_AAGGAPS(ph)PPPPVR_` -> phosphorylation
`_AAASK(gl)LGEFAK_` -> ubiquitination

Therefore, you would have to convert those like this:

# Phosphopeptide in evidence file: `_AAGGAPS(ph)PPPPVR_`
artmsProtein2SiteConversion(
  evidence_file = "/path/to/the/evidence.txt", 
  ref_proteome_file = "/path/to/the/reference_proteome.fasta", 
  output_file = "/path/to/the/output/ph-sites-evidence.txt",
  mod_type = "PTM:STY:(ph)")

# Ubiquitinated peptide: `_AAASK(gl)LGEFAK_`
artmsProtein2SiteConversion(
  evidence_file = "/path/to/the/evidence.txt", 
  ref_proteome_file = "/path/to/the/reference_proteome.fasta", 
  output_file = "/path/to/the/output/ub-sites-evidence.txt",
  mod_type = "PTM:K:(gl)")

Again, please, check the vignette/guidelines section PTM-Site/Peptide-specific Quantification of Changes (PH, UB, AC)
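
As a quick way to see how the peptides are annotated, a minimal sketch in base R could tabulate the modification tags found in the Modified sequence column. The helper name `count_mod_tags` is hypothetical (it is not part of artMS), and the pattern assumes the short MaxQuant notation, where a residue is followed by a lowercase tag in parentheses, as in the examples above.

```r
# Hypothetical helper (not part of artMS): tabulate residue + tag pairs,
# e.g. "S(ph)" or "K(methyl)", found in a vector of modified sequences.
count_mod_tags <- function(modified_sequences) {
  # Assumption: short notation, one uppercase residue followed by a
  # lowercase tag in parentheses
  hits <- regmatches(modified_sequences,
                     gregexpr("[A-Z]\\([a-z]+\\)", modified_sequences))
  table(unlist(hits))
}

# Toy input built from the peptides quoted in this thread
example_peptides <- c("_AAGGAPS(ph)PPPPVR_",
                      "_AAASK(gl)LGEFAK_",
                      "_AAAAAAALQAK(methyl)SDEK_")
tag_counts <- count_mod_tags(example_peptides)
print(tag_counts)

# With a real MaxQuant file (the path is a placeholder):
# evidence <- read.delim("/path/to/the/evidence.txt",
#                        check.names = FALSE, stringsAsFactors = FALSE)
# count_mod_tags(evidence[["Modified sequence"]])
```

If the tag you expect (for example `(methyl)`) does not show up in the table, the search probably did not include that modification, or the file uses a different notation than this sketch assumes.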

hemingwang commented 3 years ago

I followed your guide but I get the same error. I searched your source code in MSstats_functions.R (https://github.com/biodavidjm/artMS/blob/master/R/MSstats_functions.R). In the part under # DEAL WITH OLD CONFIGURATION FILES, where config$data$filters$modification is handled, there are only NULL, AB, UB, PH, and APMS; I did not find anything about the user-defined modification. Is that the cause of the problem?

biodavidjm commented 3 years ago

Yes, please, install the latest version. And make sure you use the latest version of the artms config file as well. And please, copy and paste here a methylated modified peptide. Thanks

hemingwang commented 3 years ago

My methylated modified peptide is AAAAAAALQAK(methyl)SDEK

biodavidjm commented 3 years ago

So...

mod_type = "PTM:KR:(methyl)"

should work with the latest artMS version.

hemingwang commented 3 years ago

It's the same error. But in the error message, the modification is shown in upper case rather than lower case. Today I installed the newest artMS from GitHub.

Error in .artms_filterData(x = x, config = config, verbose = verbose) : The config > data > filters > modification PTM:KR:(METHYL) is not valid option

Running artmsProtein2SiteConversion works fine, but when I put the modification in the config file, it reports the error.

biodavidjm commented 3 years ago

Sorry, not sure I follow what your problem is now. I advise you to run the commands and paste the entire output, not just the error, so we can debug it. For example, I still see PTM:KR:(METHYL) and you reported that mod_type = "PTM:KR:(methyl)" worked. So not sure what the problem is.

hemingwang commented 3 years ago

My R console result is:

> # Convert 'Leading razor protein' evidence's file column
> artmsProtein2SiteConversion(
+   evidence_file = "data_methyl/evidence.txt", # ORIGINAL
+   column_name = "Leading razor protein",
+   ref_proteome_file = "data_methyl/human.fasta", 
+   output_file = "methyl-sites-evidence.txt", # SITES VERSION
+   mod_type = "PTM:KR:methyl")
-------------------------------------------------------------
artMS: Annotate Protein ID column to ProteinID_Site notation 
-------------------------------------------------------------
>> FILTERING MODIFIED PEPTIDES
--- SELECTING PEPTIDES MODIFIED WITH USER DEFINED PTM, with this regex: (KR)\(methyl\)
--- OPENING EVIDENCE FILE 
---(+) 'Modified sequence' column available
--- Long notation detected in <Modified sequence> column: converting to short notation
--- READING REFERENCE PROTEOME 
--- EXTRACTING PTM POSITIONS FROM THE MODIFIED PEPTIDES 
   (This might take a long time depending on the size of the fasta/evidence file) 
- Protein <Q6KHK5> not in the sequence database
- Protein <B3PNH3> not in the sequence database
- Protein <P47698> not in the sequence database
--- ALL SEQUENCES MAPPED 
>> FILES OUT:
--- New evidence-site file: methyl-sites-evidence.txt
--- Details of the Mappings: methyl-sites-evidence-mapping.txt
>> CONVERSION COMPLETED
Warning message:
In readLines(file) :
  incomplete final line found on 'data_methyl/human.fasta'
> # Convert 'Proteins' evidence's file column
> artmsProtein2SiteConversion(
+   evidence_file = "methyl-sites-evidence.txt", # <- USE SITES VERSION
+   column_name = "Proteins",
+   overwrite_evidence = TRUE, # <--- TURN ON
+   ref_proteome_file = "data_methyl/human.fasta", 
+   output_file = "methyl-sites-evidence.txt", # <- SITES VERSION
+   mod_type = "PTM:KR:methyl")
-------------------------------------------------------------
artMS: Annotate Protein ID column to ProteinID_Site notation 
-------------------------------------------------------------
>> FILTERING MODIFIED PEPTIDES
--- SELECTING PEPTIDES MODIFIED WITH USER DEFINED PTM, with this regex: (KR)\(methyl\)
--- OPENING EVIDENCE FILE 
---(+) 'Modified sequence' column available
--- READING REFERENCE PROTEOME 
--- EXTRACTING PTM POSITIONS FROM THE MODIFIED PEPTIDES 
   (This might take a long time depending on the size of the fasta/evidence file) 
- Protein <Q6MS92> not in the sequence database
- Protein <Q2ST36> not in the sequence database
- Protein <Q4A604> not in the sequence database
- Protein <Q6KI82> not in the sequence database
- Protein <P33253> not in the sequence database
- Protein <Q98QU5> not in the sequence database
- Protein <P47639> not in the sequence database
- Protein <Q601Z5> not in the sequence database
- Protein <Q4AAV7> not in the sequence database
- Protein <Q4A8V9> not in the sequence database
- Protein <Q8EWY8> not in the sequence database
- Protein <Q50331> not in the sequence database
--- ALL SEQUENCES MAPPED 
>> FILES OUT:
--- New evidence-site file: methyl-sites-evidence.txt
--- Details of the Mappings: methyl-sites-evidence-mapping.txt
>> CONVERSION COMPLETED
Warning message:
In readLines(file) :
  incomplete final line found on 'data_methyl/human.fasta'
> artmsQuantification(yaml_config_file = "methyl_config.yaml",display_msstats = T)
--------------------------------------------
artMS: Relative Quantification using MSstats
--------------------------------------------
>> Reading the configuration file
>> LOADING DATA 
>> MERGING FILES 
>> CONVERT Intensity values < 1 to NA
>> FILTERING 
-- Contaminants CON__|REV__ removed
-- Removing protein groups
-- Use <Leading.razor.protein> as Protein ID
Error in .artms_filterData(x = x, config = config, verbose = verbose) : 
  The config > data > filters > modification PTM:K:methyl is not valid option

And my config file is

files:
  evidence: methyl-sites-evidence.txt
  keys: data_methyl/keys.txt
  contrasts: data_methyl/contrast.txt
  output: res/results.txt
qc:
  basic: 0
  extended: 0
  extendedSummary: 0
data:
  enabled: 1
  silac:
    enabled: 0
  filters:
    enabled: 1
    contaminants: 1
    protein_groups: remove
    modifications: PTM:KR:methyl
  sample_plots: 1
msstats:
  enabled: 1
  msstats_input: ~
  profilePlots: none
  normalization_method: equalizeMedians
  normalization_reference: ~
  summaryMethod: TMP
  MBimpute: 1
  censoredInt: NA
  feature_subset: all
  n_top_feature: 3
  logTrans: 2
  remove_uninformative_feature_outlier: no
  min_feature_count: 2
  equalFeatureVar: yes
  remove50missing: no
  fix_missing: ~
  maxQuantileforCensored: 0.999
  use_log_file: no
  append: no
  log_file_path: ~
output_extras:
  enabled: 1
  annotate:
    enabled: 1
    species: HUMAN
  plots:
    volcano: 1
    heatmap: 1
    LFC: -0.58 0.58
    FDR: 0.05
    heatmap_cluster_cols: 0
    heatmap_display: log2FC

biodavidjm commented 3 years ago

Thanks, it is clear now. Please, give us some time to fix this issue.

hemingwang commented 3 years ago

Thank you for your patience in helping me! I am looking forward to your reply!

hemingwang commented 3 years ago

Thanks a lot for your help. How is this issue going? Do I need to provide more information?

biodavidjm commented 3 years ago

Thanks, we have everything we need. It will take several days to get it done. Thanks for your patience (I will report here when it is ready).

biodavidjm commented 3 years ago

Hi @hemingwang. I have a temporary workaround that should work for you.

Keep the exact same configuration file, but instead of

modifications: PTM:KR:methyl

use

modifications: AB

Since you are performing a site-specific analysis and providing methyl-sites-evidence.txt as the evidence file, the modified peptides are already filtered. The modifications option would only be required for a "global methylation" analysis; for a site-specific analysis, the evidence file is already filtered. The next release of artMS should address this issue.

Please, let me know how it goes.
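
One way to double-check that the site-converted evidence file really contains only methylated peptides might be a small base R check like the sketch below. The helper name `summarise_tag` is hypothetical (it is not part of artMS), and it assumes the modification appears as the literal string `(methyl)` in the Modified sequence column.

```r
# Hypothetical helper (not part of artMS): count how many modified
# sequences carry a given tag, matched as a literal string.
summarise_tag <- function(modified_sequences, tag = "(methyl)") {
  has_tag <- grepl(tag, modified_sequences, fixed = TRUE)
  c(with_tag = sum(has_tag), without_tag = sum(!has_tag))
}

# Toy input; the first peptide is the one quoted in this thread
toy_sequences <- c("_AAAAAAALQAK(methyl)SDEK_", "_AAASK(gl)LGEFAK_")
tag_summary <- summarise_tag(toy_sequences)
print(tag_summary)

# With the site-converted file from this thread:
# sites <- read.delim("methyl-sites-evidence.txt",
#                     check.names = FALSE, stringsAsFactors = FALSE)
# summarise_tag(sites[["Modified sequence"]])
```

If `without_tag` comes back as 0 for the site-converted file, the file is already restricted to the modification of interest, which is the premise of the workaround.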