Closed by cvanderaa 3 years ago
We should simplify and curate the rowData of the specht2019v2 dataset. These are the current fields:
Sequence
Length
Modifications
Modified.sequence
Deamidation..N..Probabilities
Oxidation..M..Probabilities
Deamidation..N..Score.Diffs
Oxidation..M..Score.Diffs
Acetyl..Protein.N.term.
Deamidation..N.
Oxidation..M.
Missed.cleavages
Proteins: should be renamed to ProteinGroup
Leading.proteins
protein: should be renamed to Protein
Gene.names: names of all genes associated to Proteins, should be associated to protein instead
Protein.names: names of all proteins associated to Proteins, should be associated to protein instead
Type
Set
MS.MS.m.z
Charge
m.z
Mass
Resolution
Uncalibrated...Calibrated.m.z..ppm.
Uncalibrated...Calibrated.m.z..Da.
Mass.error..ppm.
Mass.error..Da.
Uncalibrated.mass.error..ppm.
Uncalibrated.mass.error..Da.
Max.intensity.m.z.0
Retention.time
Retention.length
Calibrated.retention.time
Calibrated.retention.time.start
Calibrated.retention.time.finish
Retention.time.calibration
Match.time.difference
Match.m.z.difference
Match.q.value: all NA because no MBR was performed
Match.score: all NA because no MBR was performed
Number.of.data.points
Number.of.scans
Number.of.isotopic.peaks
PIF
Fraction.of.total.spectrum
Base.peak.fraction
PEP: deprecated after DART-ID update
MS.MS.count
MS.MS.scan.number
Score
Delta.score
Combinatorics
Intensity
Reporter.intensity.corrected.0
Reporter.intensity.corrected.1
Reporter.intensity.corrected.2
Reporter.intensity.corrected.3
Reporter.intensity.corrected.4
Reporter.intensity.corrected.5
Reporter.intensity.corrected.6
Reporter.intensity.corrected.7
Reporter.intensity.corrected.8
Reporter.intensity.corrected.9
Reporter.intensity.corrected.10
Reporter.intensity.count.0
Reporter.intensity.count.1
Reporter.intensity.count.2
Reporter.intensity.count.3
Reporter.intensity.count.4
Reporter.intensity.count.5
Reporter.intensity.count.6
Reporter.intensity.count.7
Reporter.intensity.count.8
Reporter.intensity.count.9
Reporter.intensity.count.10
Reporter.PIF
Reporter.fraction
Reverse
Potential.contaminant
id
Protein.group.IDs
Peptide.ID
Mod..peptide.ID
MS.MS.IDs
Best.MS.MS
AIF.MS.MS.IDs
Deamidation..N..site.IDs
Oxidation..M..site.IDs
remove: generated by DART-ID
dart_PEP: generated by DART-ID, should be renamed PEP
dart_qval: generated by DART-ID
razor_protein_fdr: generated by DART-ID
Deamidation..NQ..Probabilities
Deamidation..NQ..Score.Diffs
Deamidation..NQ.
Reporter.intensity.corrected.11
Reporter.intensity.corrected.12
Reporter.intensity.corrected.13
Reporter.intensity.corrected.14
Reporter.intensity.corrected.15
Reporter.intensity.corrected.16
Reporter.intensity.count.11
Reporter.intensity.count.12
Reporter.intensity.count.13
Reporter.intensity.count.14
Reporter.intensity.count.15
Reporter.intensity.count.16
Deamidation..NQ..site.IDs
input_id: generated by DART-ID
rt_minus: generated by DART-ID
rt_plus: generated by DART-ID
mu: generated by DART-ID
muij: generated by DART-ID
sigmaij: generated by DART-ID
pep_new: generated by DART-ID
exp_id: generated by DART-ID
peptide_id: generated by DART-ID
stan_peptide_id: generated by DART-ID
exclude: generated by DART-ID
residual: generated by DART-ID
participated: generated by DART-ID
peptide: peptide sequence + charge
Description of most fields is given on the MaxQuant website.
The fields in bold are the ones I think must stay.
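To make the renames noted in the field list concrete, here is a minimal sketch that works on the field names only. It is written in plain Python for illustration and is not the package's actual curation code. The helper `curate_fields` and the `RENAMES` and `DROPS` tables are hypothetical. Dropping the all-NA match columns, and dropping the original PEP before dart_PEP takes over its name, are assumptions; the issue only flags those fields.

```python
# Renames proposed in the issue (old name -> new name).
RENAMES = {
    "Proteins": "ProteinGroup",
    "protein": "Protein",
    "dart_PEP": "PEP",
}

# Assumption: fields flagged in the issue as all NA or deprecated are dropped.
DROPS = {
    "Match.q.value",  # all NA because no MBR was performed
    "Match.score",    # all NA because no MBR was performed
    "PEP",            # deprecated after DART-ID update; dart_PEP takes its name
}


def curate_fields(fields):
    """Drop flagged fields first, then apply the renames, preserving order."""
    kept = [f for f in fields if f not in DROPS]
    curated = [RENAMES.get(f, f) for f in kept]
    # Guard against a rename colliding with a field that was kept.
    if len(set(curated)) != len(curated):
        raise ValueError("renaming produced duplicate field names")
    return curated


example = ["Sequence", "Proteins", "protein", "Match.score", "PEP", "dart_PEP"]
print(curate_fields(example))
# ['Sequence', 'ProteinGroup', 'Protein', 'PEP']
```

The drops run before the renames so that the deprecated PEP column is gone by the time dart_PEP is renamed to PEP; the other order would leave two fields with the same name.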