Closed petermr closed 4 years ago
OK sir.
Sir, please check instruments20191006.tsv.
The column descriptions are as follows.
INSTRUMENTS - cleaned names of instruments used in GC-MS analysis.
INSTRUMENTS_NORMALIZED - normalised list of instruments used in GC-MS analysis.
Total count of unique records - 95.
Good start. There are some misspellings. Are these in the paper, or did you mistype them? If they are in the paper, that's a good indicator of author errors.
On Sun, 6 Oct 2019, 14:06 Ambarish Kumar, notifications@github.com wrote:
Sir, please check instruments20191006.tsv https://github.com/petermr/CEVOpen/blob/master/dictionary/instrument/instruments20191006.tsv
The column descriptions are as follows.
INSTRUMENTS - cleaned names of instruments used in GC-MS analysis.
INSTRUMENTS_NORMALIZED - normalised list of instruments used in GC-MS analysis.
Total count of unique records - 95. Please check the sheet and suggest changes before the dictionary is made.
Sir, I extracted the text snippets exactly as they appear in the articles, so it is unlikely that I misspelled them. It is more likely that the authors wrote the names the way they appear on the sheet.
e.g.
at line 34 - INSTRUMENTS column
`Aligent 6890`
A few are duplicates, e.g.
line 7
`Agilent 6890 (GC) and Agilent 5973 (MSD)`
and line 10
`Agilent 6890 (GC) and Agilent 5973 (MS)`
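One way to surface candidates like these for manual review is to compare every pair of entries by string similarity. The following is only a minimal sketch: the function name `near_duplicates`, the 0.8 threshold, and the correctly spelled `Agilent 6890` entry are assumptions for illustration, not taken from the sheet.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.8):
    """Return pairs of distinct names whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        # Case-insensitive comparison; ratio() is 1.0 for identical strings.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if a != b and ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Illustrative entries; "Agilent 6890" is a hypothetical correctly spelled row.
names = [
    "Aligent 6890",
    "Agilent 6890",
    "Agilent 6890 (GC) and Agilent 5973 (MSD)",
    "Agilent 6890 (GC) and Agilent 5973 (MS)",
    "Shimadzu QP-5000 GC-MS",
]

for first, second in near_duplicates(names):
    print(f"review: {first!r} vs {second!r}")
```

The pairs are only suggestions; whether two entries really name the same instrument still needs a human decision.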
@ambarishK thanks. Instruments have qualifiers:
Agilent 7890 (GC) Agilent 7890 (GC)
Agilent 7890 (GC) and Agilent 5975 (MSD) Agilent 7890 (GC) and Agilent 5975 (MSD)
Agilent 7890 (GC) and Agilent 5975 (MS)
Agilent 7890 N (GC) and Triple Quad 7000 A model mass detector Agilent 7890 N (GC) and Triple Quad 7000 A model mass detector
Agilent 7890A
Agilent 7890A (GC) Agilent 7890A (GC)
We would need a GC-MS expert to tell us whether the letters (A, N) are significant. For the moment I suggest we treat them as separate entries:
Agilent 7890
Agilent 7890 N
Agilent 7890N
Later we can use a regex to deal with whitespace differences.
I will create a first-pass dictionary.
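The whitespace regex mentioned above could look something like the sketch below. It assumes that a single capital letter following a model number (as in `7890 N`) is a suffix that should be joined to the number; the function name `normalise_spacing` is hypothetical, and whether the suffix itself is significant is left open, as discussed.

```python
import re


def normalise_spacing(name):
    """Collapse whitespace runs and join a model number to a lone trailing letter."""
    # Trim the ends and reduce any run of whitespace to a single space.
    name = re.sub(r"\s+", " ", name.strip())
    # "7890 N" -> "7890N": digits, one space, then a single capital letter
    # that is not the start of a longer word such as "GC".
    return re.sub(r"(\d+) ([A-Z])\b", r"\1\2", name)


for raw in ["Agilent 7890", "Agilent 7890 N", "Agilent  7890N"]:
    print(repr(raw), "->", repr(normalise_spacing(raw)))
```

With this rule `Agilent 7890 N` and `Agilent 7890N` collapse to one entry, while `Agilent 7890` stays separate.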
OK sir.
Create a list of instruments used in analysing (but NOT extracting) essential oils. This can be used as ground truth for Tiago's extraction sub-project.
These should be found in the "Materials and methods" section.
Create a new column for GC-MS; for now, just extract "HP6890" (GC) and "HP 5973" (MS).
Extract "Shimadzu QP-5000 GC-MS".
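A rough sketch of how such mentions might be pulled from a methods paragraph is given below. The vendor list, the pattern, and the sample sentence are all assumptions made for illustration; the pattern captures only the vendor and model number, so a trailing qualifier such as "GC-MS" would still need separate handling.

```python
import re

# Hypothetical vendor list; extend it as new instruments turn up in the sheet.
VENDORS = ["HP", "Agilent", "Shimadzu"]

# Vendor, optional space, optional letter prefix such as "QP-",
# a four-digit model number, and an optional single-letter suffix.
INSTRUMENT_RE = re.compile(
    r"\b(?:" + "|".join(VENDORS) + r")\s?(?:[A-Z]{1,3}-)?\d{4}[A-Z]?\b"
)


def extract_instruments(text):
    """Return instrument mentions in the order they appear in the text."""
    return INSTRUMENT_RE.findall(text)


# Invented sample sentence, not taken from any article.
sample = (
    "Oils were analysed on an HP6890 gas chromatograph coupled to an "
    "HP 5973 mass selective detector; a Shimadzu QP-5000 GC-MS was used "
    "for confirmation."
)
print(extract_instruments(sample))
```

Matches from this kind of pattern can be checked against the hand-built list, which remains the ground truth.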