petermr / CEVOpen

Contentmining of Open phytochemical literature for medicinal activities

📚 Documentation: MASTER INDEX of Dictionary Descriptions for Oil186 test batch #74

Open EmanuelFaria opened 4 years ago

EmanuelFaria commented 4 years ago

Here we describe the process of:

  1. creating a master INDEX (INDEXofOIL186Dictionaries.md) of [DictionaryName]DictionaryDescription.md documents, which will describe the contents of the individual dictionaries created to date for data collected for Oil186,
  2. creating individual "DictionaryDescription" documents for each dictionary, each of which will have its own GitHub issue number to facilitate discussion and correction.
EmanuelFaria commented 4 years ago

I started the task of creating individual Dictionary Description Documentation (“DDD”) for each dictionary with the following steps:

  1. Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”

  2. Copied duplicates of them within the following sub-directories: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized

  3. Examined the contents of each file in each (now sorted) directory and, hopefully, picked the right ones to begin drafting DDDs for each, along with an “AboutOIL186Dictionaries.md” master description document, all of which can be found here: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186

I will provide further details as updates are made.

EmanuelFaria commented 4 years ago

Clarification requested:

@petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for the ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?

Direction requested:

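Please have a look and provide feedback as to:

  • Have I chosen the correct tables to describe? If not, please point me to the right ones. (eg. the Chemical Constituents file I chose was the only one with wikidataIDs, but had very few entries).
  • What you want me to name the files
  • Where you want them to be posted
  • Would you like any changes to the formatting?

Please note:

  • The source files for the descriptions are in the .md documents
  • I have pasted questions for you at the bottom of some of them.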

Thank you.

petermr commented 4 years ago

On Fri, Jan 24, 2020 at 3:43 PM Emanuel Faria notifications@github.com wrote:

> I started the task of creating individual Dictionary Description Documentation (“DDD”) for each by the following steps:
>
> 1. Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”

Yes, there was no system in naming the files so there are almost certainly duplicates. Important to try to identify the latest one.

> 2. Copied duplicates of them within the following sub-directories: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized
>
>   • ACTIVITY
>   • CHEMICAL ANALYSIS CONSTITUENTS
>   • CHEMICAL ANALYSIS METHODS
>   • PLANT ORIGIN
>   • TARGET SPECIES

Looks appropriate.

> 3. Examined the contents of each file in each (now sorted) directory and, hopefully, picked the right ones to begin drafting DDDs for each, along with an “AboutOIL186Dictionaries.md” master description document, all of which can be found here: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186
>
>   • AboutOIL186Dictionaries.md
>   • ChemicalConstituentsDictionaryDescription.md
>   • CountryDictionaryDescription.md
>   • ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md
>   • TargetOrganismDictionaryDescription.md
>
> I will provide further details as updates are made.

Thanks.


-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
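Since the raw .tsv and .csv files were named without a system, one way to find exact copies is to group files by a hash of their contents. The following is a minimal sketch, not the project's own tooling: the function name `group_duplicates` is hypothetical, and it only detects byte-identical files. Near-duplicates that differ in content still need to be compared by hand to decide which is the latest.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_duplicates(root):
    """Group .tsv/.csv files under `root` that have identical bytes.

    Returns a list of groups; each group is a list of two or more paths
    whose contents hash to the same SHA-256 digest.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".tsv", ".csv"}:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests shared by at least two files.
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "articleAnalysis/oil186/raw" is the path inside a local clone (assumed).
    for group in group_duplicates("articleAnalysis/oil186/raw"):
        print("identical:", ", ".join(str(p) for p in group))
```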

petermr commented 4 years ago

On Fri, Jan 24, 2020 at 3:49 PM Emanuel Faria notifications@github.com wrote:

> Clarification requested:
>
> @petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for the ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?

Note that "tree/master/" chunk is an artefact of Github and won't appear on your disk
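As a rough illustration of the "tree/master/" note, the sketch below turns a GitHub browser URL into the path relative to the root of a local clone. The function name `repo_relative_path` is hypothetical, and the code assumes the branch name contains no slash.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def repo_relative_path(url):
    """Drop owner, repo and the 'tree/<branch>' or 'blob/<branch>' segment."""
    parts = [p for p in PurePosixPath(urlparse(url).path).parts if p != "/"]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url!r}")
    rest = parts[2:]  # everything after owner/repo
    if len(rest) >= 2 and rest[0] in ("tree", "blob"):
        rest = rest[2:]  # skip the view type and the branch name
    return PurePosixPath(*rest)


if __name__ == "__main__":
    url = "https://github.com/petermr/CEVOpen/tree/master/dictionary"
    print(repo_relative_path(url))
```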

B) is the production version, but you should check if there is an obviously larger or newer/cleaner version in A);

A dictionary has a list of entries like:

... each entry MUST have a term and SHOULD have a wikidata ID. It MAY have a name (which is often the same as the term, but not always). Ideally they should all have IDs. The description is normally the Wikidata description.

> *Direction requested:*
>
> Please have a look and provide feedback as to:
>
> - Have I chosen the correct tables to describe? If not, please point me to the right ones. (eg. the Chemical Constituents file I chose was the only one with wikidataIDs, but had very few entries).

The dictionaries should end up in https://github.com/petermr/CEVOpen/[tree/master/]dictionary

> - What you want me to name the files

For the dictionary, the name is the title in the file, e.g. CEVOpen/dictionary/targetOrganism/*targetOrganism.xml* starts ... The "targetOrganism" is the name of the file (+.xml) and also the title of the dictionary. If they are different the software will throw an error.

converted from essoldb1.0

> - Where you want them to be posted
> - Would you like any changes to the formatting?
>
> Please note:
>
> - The source files for the descriptions are in the .md documents

Please put the links in the issue so I can go straight there...

> - I have pasted questions for you at the bottom of some of them.
>
> Thank you.
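The rules in this reply (term required, Wikidata ID recommended, name optional, title equal to the file name) could be checked with a small script. This is a sketch under stated assumptions, not the project's software: the root element name `dictionary`, the attribute name `wikidataID`, the function name `check_dictionary` and the sample entries are all hypothetical. Only `title`, `desc`, `entry`, `term` and `name` are mentioned in the thread.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical sample; "Q0" is a placeholder, not a real Wikidata ID.
SAMPLE = """<dictionary title="targetOrganism">
  <desc>hypothetical sample</desc>
  <entry term="escherichia coli" name="Escherichia coli" wikidataID="Q0"/>
  <entry term="candida albicans"/>
  <entry name="no term here"/>
</dictionary>"""


def check_dictionary(xml_text, file_name):
    """Return (errors, warnings) for one dictionary file."""
    root = ET.fromstring(xml_text)
    errors, warnings = [], []
    # The title must equal the file name without its .xml extension.
    title = root.get("title")
    if title != Path(file_name).stem:
        errors.append(f"title {title!r} does not match file name {file_name!r}")
    for i, entry in enumerate(root.iter("entry"), start=1):
        term = entry.get("term")
        if not term:  # MUST have a term
            errors.append(f"entry {i}: missing term")
        if not entry.get("wikidataID"):  # SHOULD have a Wikidata ID
            warnings.append(f"entry {i} ({term!r}): no wikidataID")
        # A missing name is allowed (MAY), so it is not reported.
    return errors, warnings


if __name__ == "__main__":
    errors, warnings = check_dictionary(SAMPLE, "targetOrganism.xml")
    print("errors:", errors)
    print("warnings:", warnings)
```

Treating a missing Wikidata ID as a warning rather than an error mirrors the MUST/SHOULD distinction above.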
EmanuelFaria commented 4 years ago

Working on description document for compounds.xml (Draft of CompoundDictionaryDescription.md in the same folder now. It was made with the texts.app I told you about, @petermr ... look ok to you?).

What are the definitions for the following, please:

/desc /entry/@name /entry/@term

petermr commented 4 years ago

Is there information missing from this mail?

On Fri, Jan 24, 2020 at 11:18 PM Emanuel Faria notifications@github.com wrote:

What are the definitions for the following, please:


EmanuelFaria commented 4 years ago

Whoops. Yes.... added links to files below.

I am working on the description document for compounds.xml. The draft of CompoundDictionaryDescription.md is in the same folder now. (It was made with the free WYSIWYG markdown editor I told you about, @petermr ... look ok to you?).

I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?

/entry/@name /entry/@term

petermr commented 4 years ago

The term is the precise string used to identify the concept. The name is a human-readable string describing the concept. They are often the same.

On Sat, 25 Jan 2020, 15:58 Emanuel Faria, notifications@github.com wrote:

Whoops. Yes.... added links to files below.

Working on description document for compounds.xml https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/compound.xml For the draft of CompoundDictionaryDescription.md https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md in the same folder now. (It was made with the free WYSIWYG markdown editor http://www.texts.io/ I told you about, @petermr https://github.com/petermr ... look ok to you?).

I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?

/entry/@name /entry/@term


EmanuelFaria commented 4 years ago

Thanks Peter. Compound Dictionary description is now ready for review. https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md

Interestingly, it contains a table of contents at the top of the page, which I did not create. Does github do this by default, or was it the WYSIWYG editor I'm using?

EmanuelFaria commented 4 years ago

I've just posted draft DictionaryDescriptions for the dictionary .xml files I could find.

Location of Main Description of Descriptions .md

The main document that provides a description of all the DictionaryDescriptions is AboutOIL186Dictionaries.md. From here, you can click on the name of any of the sub-sub-headings that end with .md to get to the individual DictionaryDescription for that topic.

Location of Individual Descriptions files

Because there were two sources of .xml files to work with (either in CEVOpen/tree/master/dictionary or CEVOpen/tree/master/articleAnalysis/oil186/raw), I have stored the individual DictionaryDescription .md files accordingly in:

(Remember: I created the directory /DictionaryDuplicateTablesOrganized and copied the existing files in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/ in order to better organize them for my work on creating these dictionaries.)

Heads up

Currently, there are notes at the bottom of each of the individual dictionaries (things to fix, clean up, consider, decide, etc.). I will now begin copying the contents of each of them, including their notes, into separate comment entries for discussion and instruction for correction.

EDIT: On second thought... I'll paste the contents of the master description of descriptions below, and begin new issues for the individual ones. It will be easier to manage the conversation about corrections that way.

Manny

EmanuelFaria commented 4 years ago

Index of the OIL186 Dictionaries

This document contains information about the Manually Created Dictionaries for OIL186.

The purpose/function of Dictionaries:

  1. Identify objects/concepts (eg. “e.coli” is a concept.).

  2. Give each object clear lexical names by which they can be searched. (When an object goes by more than one name, those names are synonyms.)

  3. Give each object a link to wikidata (or other authorities) by which we can learn more about them.
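The three purposes above can be pictured as a lookup table that maps every known label for a concept (term, name, synonyms) to a single record carrying its Wikidata link. This is a minimal sketch for illustration only: the `Concept` class, its field names, `build_index`, `lookup` and the sample data are hypothetical and are not the dictionary format used in CEVOpen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    term: str                          # precise string identifying the concept
    name: str                          # human-readable label
    wikidata_id: Optional[str] = None  # link to an authority, if known
    synonyms: List[str] = field(default_factory=list)


def build_index(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every label of every concept (case-insensitively) to its record."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for label in [concept.term, concept.name, *concept.synonyms]:
            index[label.casefold()] = concept
    return index


def lookup(index: Dict[str, Concept], text: str) -> Optional[Concept]:
    """Return the concept for a label, or None if the label is unknown."""
    return index.get(text.strip().casefold())


if __name__ == "__main__":
    # "Q0" is a placeholder, not a real Wikidata ID.
    concepts = [Concept("escherichia coli", "Escherichia coli", "Q0",
                        synonyms=["e.coli", "E. coli"])]
    index = build_index(concepts)
    print(lookup(index, "E.COLI"))
```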

PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in the logical progression from Plants -> Extracts -> Testing Methods and Instruments -> Results Analysis -> Activities -> Target Organisms the activities were tested upon -> Diseases related to those target organisms

 

Plants

Layman and Botanical Names / Species

 

OilPlantDictionaryDescription.md

 

Plant Parts

The plant part or parts from which the mentioned oils are extracted

 

PlantPartsDictionaryDescription.md

 

Locations

The geographical origins of the harvested plant material

 

PlantOriginDescription.md

 

Plant Material History

 

ProcessDictionaryDescription.md

 

EO Extraction and Chemical Analysis Methods

Equipment, methods and materials used for EO extraction

ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md

 

EO Analysis Instruments

A dictionary of [24] makes/models of Gas chromatography–mass spectrometry equipment used to identify different substances within a test sample — in this case, Essential Oils mentioned in the 186 test articles downloaded from PubMed.

 

InstrumentDictionaryDescription.md

 

EO Chemical Analysis Results - Constituents and Concentrations

Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.

Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).

 

CompoundDictionaryDescription.md

 

 

EO Activities

Tested biochemical and/or biological activities, and where available, their measured results.

 

ActivityDictionaryDescription.md

 

Target Organisms

The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).

 

TargetOrganismDictionaryDescription.md


 

Diseases

Text for definitions goes here

This dictionary does not yet exist

EmanuelFaria commented 4 years ago

FYI: As I clean up each [dictionary].xml file and update their unique [DictionaryName]DictionaryDescription.md files, I have also updated the master INDEX of Oil186 Dictionary Descriptions here: (INDEXofOIL186Dictionaries.md)

EmanuelFaria commented 4 years ago

As of today, we have 11 finished dictionaries. They are:

  1. eoActivity
  2. eoAnalysisMethod
  3. eoCompound
  4. eoExtractionMethod
  5. eoPlant
  6. eoPlantMaterialHistory
  7. eoPlantPart
  8. eoTargetOrganism
  9. geoLocation
  10. humanDiseases
  11. pests

... as well as a master INDEX of their descriptions, pasted below:

Index Oil186 Dictionaries

This index contains information about the Manually Created Dictionaries for OIL186.

PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in the logical progression.

The purpose/function of Dictionaries:

  1. Identify “things” as objects or concepts (eg. “e.coli” is a concept.).

  2. Give each object clear lexical names by which they can be searched.
    (When an object goes by more than one name, those names are synonyms.)

  3. Give each object a link to wikidata (or other authorities) by which we can learn more about them.

 


EO Plant

eoPlant.md

 


EO Plant Part

eoPlantPart.md

 


Geo Location

geoLocation.md

 


EO Plant Material History

eoPlantMaterialHistory.md

 


EO Extraction Method

eoExtractionMethod.md

 


EO Analysis Method

Analytical chemistry studies and uses instruments and methods used to separate, identify, and quantify matter.[1] In practice, separation, identification or quantification may constitute the entire analysis or be combined with another method. Separation isolates analytes. Qualitative analysis identifies analytes, while quantitative analysis determines the numerical amount or concentration.

Analytical chemistry consists of classical, wet chemical methods and modern, instrumental methods.[2] Classical qualitative methods use separations such as precipitation, extraction, and distillation. Identification may be based on differences in color, odor, melting point, boiling point, radioactivity or reactivity. Classical quantitative analysis uses mass or volume changes to quantify amount. Instrumental methods may be used to separate samples using chromatography, electrophoresis or field flow fractionation. Then qualitative and quantitative analysis can be performed, often with the same instrument and may use light interaction, heat interaction, electric fields or magnetic fields. Often the same instrument can separate, identify and quantify an analyte.

(Source: https://en.wikipedia.org/wiki/Analytical_chemistry)

 

eoAnalysisMethod.md

 


EO Compound

Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.

Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).

 

eoCompound.md

 


EO Activity

eoActivity.md

 


EO Target Organism

The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).

 

eoTargetOrganism.md

 


Human Diseases

humanDiseases.md

 


Pests

disease.md