Open EmanuelFaria opened 4 years ago
I started the task of creating individual Dictionary Description Documentation (“DDD”) for each dictionary by the following steps:
Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”
Copied duplicates of them within the following sub-directories: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized
I will provide further details as updates are made.
@petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for these ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?
Please have a look and provide feedback as to:
Please note:
Thank you.
On Fri, Jan 24, 2020 at 3:43 PM Emanuel Faria notifications@github.com wrote:
I started the task of creating individual Dictionary Description Documentation (“DDD”) for each by the following steps:
Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”
Yes, there was no system in naming the files so there are almost certainly duplicates. Important to try to identify the latest one.
Copied duplicates of them within the following sub-directories:
- ACTIVITY
- CHEMICAL ANALYSIS CONSTITUENTS
- CHEMICAL ANALYSIS METHODS
- PLANT ORIGIN
- TARGET SPECIES
Looks appropriate.
Examined the contents of each file in each (now sorted) directory and — hopefully — picked the right ones to begin drafting DDDs for each — along with an “AboutOIL186Dictionaries.md” master description document — all of which can be found here: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186
- AboutOIL186Dictionaries.md
- ChemicalConstituentsDictionaryDescription.md
- CountryDictionaryDescription.md
- ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md
- TargetOrganismDictionaryDescription.md
I will provide further details as updates are made.
Thanks.
-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
On Fri, Jan 24, 2020 at 3:49 PM Emanuel Faria notifications@github.com wrote:
Clarification requested:
@petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for these ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?
Note that the "tree/master/" chunk is an artefact of GitHub and won't appear on your disk.
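To make the point about "tree/master/" concrete, here is a minimal sketch of mapping a GitHub browser URL to the corresponding path inside a local clone. The function name `github_url_to_local_path` is hypothetical, and the sketch assumes the clone sits in a folder named after the repository and that the branch is `master`.

```python
from urllib.parse import urlparse


def github_url_to_local_path(url: str, branch: str = "master") -> str:
    """Return the path inside a local clone for a GitHub tree/blob URL.

    Assumption: the clone lives in a folder named after the repository.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected shape: owner / repo / ("tree" | "blob") / branch / path...
    if len(parts) >= 4 and parts[2] in ("tree", "blob") and parts[3] == branch:
        return "/".join([parts[1]] + parts[4:])
    # Anything else (for example the repository root) maps to the repo folder.
    return parts[1] if len(parts) >= 2 else ""


print(github_url_to_local_path(
    "https://github.com/petermr/CEVOpen/tree/master/dictionary"
))
```

Both "tree/master/" (directories) and "blob/master/" (single files) are dropped, since neither exists on disk.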
B) is the production version, but you should check whether there is an obviously larger or newer/cleaner version in A).
A dictionary has a list of entries like:
Working on description document for compounds.xml (Draft of CompoundDictionaryDescription.md in the same folder now. It was made with the texts.app I told you about, @petermr ... look ok to you?).
What are the definitions for the following, please:
/desc /entry/@name /entry/@term
Is there information missing from this mail?
On Fri, Jan 24, 2020 at 11:18 PM Emanuel Faria notifications@github.com wrote:
What are the definitions for the following, please:
Whoops. Yes.... added links to files below.
Working on the description document for compounds.xml. The draft of CompoundDictionaryDescription.md is in the same folder now. (It was made with the free WYSIWYG markdown editor I told you about, @petermr ... look ok to you?)
I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?
/entry/@name /entry/@term
The term is the precise string used to identify the concept. The name is a human-readable string describing the concept. They are often the same.
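As an illustration of the distinction, the sketch below reads `/desc`, `/entry/@term` and `/entry/@name` from a small XML snippet. The snippet is invented for this example and is not taken from compound.xml; the root element name `dictionary`, the sample values and the helper name `read_entries` are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real dictionary files may be laid out differently.
SAMPLE = """\
<dictionary title="sample">
  <desc>Invented example entries for illustration only</desc>
  <entry term="limonene" name="Limonene"/>
  <entry term="alpha-pinene" name="alpha-Pinene"/>
</dictionary>
"""


def read_entries(xml_text):
    """Return the dictionary description and a list of (term, name) pairs."""
    root = ET.fromstring(xml_text)
    desc = root.findtext("desc", default="")
    # term: the precise string identifying the concept.
    # name: the human-readable string describing it.
    entries = [(e.get("term"), e.get("name")) for e in root.findall("entry")]
    return desc, entries


desc, entries = read_entries(SAMPLE)
print(desc)
for term, name in entries:
    print(f"term={term!r} name={name!r}")
```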
On Sat, 25 Jan 2020, 15:58 Emanuel Faria, notifications@github.com wrote:
Whoops. Yes.... added links to files below.
Working on description document for compounds.xml https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/compound.xml For the draft of CompoundDictionaryDescription.md https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md in the same folder now. (It was made with the free WYSIWYG markdown editor http://www.texts.io/ I told you about, @petermr https://github.com/petermr ... look ok to you?).
I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?
/entry/@name /entry/@term
Thanks Peter. Compound Dictionary description is now ready for review. https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md
Interestingly, it contains a table of contents at the top of the page, which I did not create. Does github do this by default, or was it the WYSIWYG editor I'm using?
I've just posted draft DictionaryDescriptions for the dictionary .xml files I could find.
Location of Main Description of Descriptions .md
The main document that provides a description of all the DictionaryDescriptions is AboutOIL186Dictionaries.md. From there, you can click on the name of any of the sub-sub-headings that end with .md to get to the individual DictionaryDescription for that topic.
Location of Individual Descriptions files
Because there were two sources of .xml files to work with (either in CEVOpen/tree/master/dictionary or CEVOpen/tree/master/articleAnalysis/oil186/raw), I have stored the individual DictionaryDescription .md files accordingly in:
(Remember: I created the directory /DictionaryDuplicateTablesOrganized and copied the existing files in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/ in order to better organize them for my work on creating these dictionaries.)
Heads up
Currently, there are notes at the bottom of each of the individual dictionaries: things to fix, clean up, consider, decide, etc. I will now begin copying the contents of each of them, including their notes, into separate comment entries for discussion and instruction for correction.
EDIT: On second thought... I'll paste the contents of the master description of descriptions below, and begin new issues for the individual ones. It will be easier to manage the conversation about corrections that way.
Manny
This document contains information about the Manually Created Dictionaries for OIL186.
The purpose/function of Dictionaries:
Identify objects/concepts (e.g. "e.coli" is a concept).
Give each object clear lexical names by which it can be searched. (When an object goes by more than one name, those names are synonyms.)
Give each object a link to wikidata (or other authorities) by which we can learn more about them.
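The three purposes above can be sketched in a few lines of Python: one concept, several lexical names, and one link to an authority. This is only an illustration under stated assumptions. The `Entry` class, the `build_index` and `find` helpers and the placeholder identifier are hypothetical and do not come from the CEVOpen code.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: str                      # primary lexical name of the concept
    wikidata_id: str               # link to the authority (placeholder below)
    synonyms: list = field(default_factory=list)


def build_index(entries):
    """Map every name (term or synonym, case-insensitive) to its entry."""
    index = {}
    for entry in entries:
        for label in [entry.term, *entry.synonyms]:
            index[label.lower()] = entry
    return index


def find(index, text):
    """Return the entry matching a name, or None if it is unknown."""
    return index.get(text.strip().lower())


# "Q-PLACEHOLDER" stands in for a real Wikidata ID, which is not given here.
entries = [Entry("Escherichia coli", "Q-PLACEHOLDER", ["E. coli", "e.coli"])]
index = build_index(entries)

hit = find(index, "e.coli")
print(hit.term, hit.wikidata_id)
```

Searching by any synonym returns the same entry, which is what lets different spellings in the articles resolve to a single concept.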
PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in a logical progression: Plants -> Extracts -> Testing Methods and Instruments -> Results Analysis -> Activities -> Target Organisms the activities were tested upon -> Diseases related to those target organisms
Layman and Botanical Names / Species
Description: A dictionary of 1678 constituent chemical compounds extracted from Essential Oils mentioned in the 186 test articles downloaded from PubMed. Of the 1678 entries, ?????? had their names normalized and tagged with corresponding Wikidata IDs, the other 112 remain to be resolved.
Filename: OilPlant.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/plant/oilplant.xml
The plant part or parts from which the mentioned oils are extracted
Description: A dictionary of [XX] part(s) of a plant from which Essential Oils — mentioned in the 186 test articles downloaded from PubMed — were extracted.
Filename: plantParts20191014.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/plantparts/raw/plantParts20191014.xml
The geographical origins of the harvested plant material
Description: A dictionary of 46 countries of origin mentioned in the 186 source articles for plants being tested.
Filename: country20191222.tsv
File Location: https://github.com/petermr/CEVOpen/blob/master/articleAnalysis/oil186/raw/country20191222.tsv
Description: A dictionary of [XX] plant processes from which Essential Oils — mentioned in the 186 test articles downloaded from PubMed — were harvested.
Filename: process20191014.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/process/process20191014.xml
Equipment, methods and materials used for EO extraction
Description: A dictionary of 6 Methods of Essential Oil extraction and 6 Types of Chemical Analysis, mentioned in the 186 source articles for plant extracts being tested.
Filename: methodAndAnalysisExtraction20191225.tsv
File Location: https://github.com/petermr/CEVOpen/blob/master/articleAnalysis/oil186/raw/methodAndAnalysisExtraction20191225.tsv
Description: A dictionary of [24] makes/models of Gas chromatography–mass spectrometry equipment used to identify different substances within a test sample — in this case, Essential Oils mentioned in the 186 test articles downloaded from PubMed.
Filename: instrument.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/instrument/raw/instrument.xml
Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.
Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).
Description: A dictionary of 2114 constituent chemical compounds extracted from Essential Oils mentioned in the 186 test articles downloaded from PubMed. Of the 2114 entries, 1010 had their names normalized and tagged with corresponding Wikidata IDs, the other 1104 remain to be resolved.
Filename: compound.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/compound.xml
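Counts such as "resolved to Wikidata IDs" versus "remain to be resolved" could be recomputed from a dictionary file rather than maintained by hand. The sketch below shows one possible way. The attribute name `wikidataID`, the sample XML and the function name `count_resolved` are assumptions made for illustration and should be checked against the actual files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the IDs are placeholders, not real Wikidata IDs.
SAMPLE = """\
<dictionary title="sample">
  <entry term="limonene" wikidataID="Q-PLACEHOLDER-1"/>
  <entry term="linalool" wikidataID="Q-PLACEHOLDER-2"/>
  <entry term="unresolved compound"/>
</dictionary>
"""


def count_resolved(xml_text, id_attr="wikidataID"):
    """Return (total, resolved, unresolved) entry counts."""
    root = ET.fromstring(xml_text)
    entries = root.findall("entry")
    # An entry counts as resolved when the ID attribute is present and non-empty.
    resolved = sum(1 for e in entries if (e.get(id_attr) or "").strip())
    return len(entries), resolved, len(entries) - resolved


total, resolved, unresolved = count_resolved(SAMPLE)
print(f"{total} entries: {resolved} resolved, {unresolved} unresolved")
```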
Tested biochemical and/or biological activities, and where available, their measured results.
Description: A dictionary of 184 activities mentioned in the 186 test articles downloaded from PubMed.
Filename: activity.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/activity/activity.xml
The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).
Description: A dictionary of [55] organisms mentioned [as subjects of experiment?] in the 186 test articles downloaded from PubMed.
Filename: targetOrganism.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/targetOrganism/targetOrganism.xml
Description: A dictionary of 133 microorganisms mentioned in tests + WikidataID + frequencies (the number of times the organisms occurred in the 186 source papers)
Filename: TargetOrganismCount.csv
File Location: https://github.com/petermr/CEVOpen/blob/master/articleAnalysis/oil186/raw/targetOrganismCount.csv
Text for definitions goes here
This dictionary does not yet exist
FYI: As I clean up each [dictionary].xml file and update their unique [DictionaryName]DictionaryDescription.md files, I have also updated the master INDEX of Oil186 Dictionary Descriptions here: (INDEXofOIL186Dictionaries.md)
As of today, we have 11 finished dictionaries. They are:
This index contains information about the Manually Created Dictionaries for OIL186.
PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in a logical progression.
The purpose/function of Dictionaries:
Identify "things" as objects or concepts (e.g. "e.coli" is a concept).
Give each object clear lexical names by which it can be searched.
(When an object goes by more than one name, those names are synonyms.)
Give each object a link to wikidata (or other authorities) by which we can learn more about them.
Description: A dictionary of 1678 plant names mentioned in the 186 test articles downloaded from PubMed. Of the 1678 entries, 1567 had their names normalized and tagged with corresponding Wikidata IDs.
Filename: eoPlant.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoPlant/eoPlant.xml
Description: A dictionary of 285 plant part terms.
Filename: eoPlantPart.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoPlantPart/eoPlantPart.xml
Description: A dictionary of 9568 entries for geolocations including country, countryISOcode, city, latitude, longitude, postal code and time zone sourced from http://www.ip2location.com, along with augmenting data on Indian states and cities, created and maintained over the years, obtained from https://network.convergenceservices.in/forum/12-joomla-development/4305-mysql-tables-for-country-states-and-indian-states-cities.html.
License information: This site or product includes IP2Location LITE data available from http://www.ip2location.com
Filename: geoLocation.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/geoLocation/geoLocation.xml
Description: A dictionary of 81 entries relating to the plant material history leading up to the extraction of Essential Oils mentioned in selected literature chosen from the 186 test articles downloaded from PubMed. The entries include key words and phrases describing: growth conditions, plant life stages, plant material selection, post-harvest treatment methods, and extracted plant material products. Of the 82 entries, 58 were resolved to WikidataIDs.
Filename: eoPlantMaterialHistory.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoPlantMaterialHistory/eoPlantMaterialHistory.xml
Description: A dictionary of 87 terms for Essential Oil extraction methods and apparatus.
Filename: eoExtractionMethod.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoExtractionMethod/eoExtractionMethod.xml
Analytical chemistry studies and uses instruments and methods used to separate, identify, and quantify matter. In practice, separation, identification or quantification may constitute the entire analysis or be combined with another method. Separation isolates analytes. Qualitative analysis identifies analytes, while quantitative analysis determines the numerical amount or concentration.
Analytical chemistry consists of classical, wet chemical methods and modern, instrumental methods. Classical qualitative methods use separations such as precipitation, extraction, and distillation. Identification may be based on differences in color, odor, melting point, boiling point, radioactivity or reactivity. Classical quantitative analysis uses mass or volume changes to quantify amount. Instrumental methods may be used to separate samples using chromatography, electrophoresis or field flow fractionation. Then qualitative and quantitative analysis can be performed, often with the same instrument and may use light interaction, heat interaction, electric fields or magnetic fields. Often the same instrument can separate, identify and quantify an analyte.
(Source: https://en.wikipedia.org/wiki/Analytical_chemistry)
Description: A dictionary of 117 entries describing instruments and methods used to separate, identify, and quantify matter — 105 being resolved to wikidata IDs, and 95 with short descriptions.
Filename: eoAnalysisMethod.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoAnalysisMethod/eoAnalysisMethod.xml
Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.
Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).
Description: A dictionary of 2114 constituent chemical compounds extracted from Essential Oils converted from essoldb1.0 data. Of the 2114 entries, 1010 had their names normalized and tagged with corresponding Wikidata IDs, the other 1104 remain to be resolved as no Wikidata IDs currently exist for them.
Filename: eoCompound.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoCompound/eoCompound.xml
Description: A dictionary of 438 essential oil or constituent compound biochemical and/or biological activities, 340 of which resolved to wikidata IDs, and 336 with descriptions of 250 characters or less.
Filename: eoActivity.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoActivity/eoActivity.xml
The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).
Description: A dictionary of terms describing 307 target organisms resolved to wikidataIDs (including genus and species of bacteria, fungi, protists, protozoa, and other microorganisms), with 154 terms including names of related diseases.
Filename: eoTargetOrganism.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoTargetOrganism/eoTargetOrganism.xml
Description: A dictionary of 3412 terms related to human diseases.
Filename: humanDiseases.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/humanDiseases/humanDiseases.xml
Description: A dictionary of 1032 terms for two categories of insects: A) Insect vectors of human pathogens sourced from https://en.wikipedia.org/wiki/Category:Insect_vectors_of_human_pathogens, and B) Winged insects sourced from https://www.insectidentification.org/winged-insect-key.asp
Filename: pests.xml
File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/pests/pests.xml
Here we describe the process of: