petermr / CEVOpen

Contentmining of Open phytochemical literature for medicinal activities
26 stars 19 forks

plant parts dictionary #34

Closed petermr closed 4 years ago

petermr commented 4 years ago

@ambarishK Can you:

(A) Transfer the plant parts dictionary from E1.0 to CEV. Add a PlantPart ID (PPID). Call the dictionary plantpart.
(A1) Link the parts to Wikidata.
(B) Do a manual analysis of plant parts in oil186. Where the part(s) are in the dictionary, give the PPID; where not, make a list and highlight any new ones.
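As a rough illustration of tasks (A) and (B), the sketch below assigns an ID to each normalized term and then sorts mentions into those already in the dictionary and those that are new. The `PPID0001`-style ID format and both function names are assumptions made for this example; the thread does not specify how PPIDs should look.

```python
def assign_ppids(terms, prefix="PPID"):
    """Give each distinct, normalized term a sequential ID (hypothetical format)."""
    unique = sorted({t.strip().lower() for t in terms if t.strip()})
    return {term: f"{prefix}{i:04d}" for i, term in enumerate(unique, start=1)}


def match_mentions(mentions, dictionary):
    """Split mentions into (known: mention -> PPID, new: list of unseen terms)."""
    known, new = {}, []
    for mention in mentions:
        key = mention.strip().lower()
        if key in dictionary:
            known[mention] = dictionary[key]
        elif key not in new:
            new.append(key)  # keep each new term once, in first-seen order
    return known, new


if __name__ == "__main__":
    plantpart = assign_ppids(["leaf", "Stem", "fruit", "leaf"])
    known, new = match_mentions(["Leaf", "rhizome", "stem", "rhizome"], plantpart)
    print(known)
    print(new)
```

Terms are lower-cased before comparison so that "Leaf" and "leaf" resolve to the same entry; the list of new terms is what would be highlighted for review.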

ambarishK commented 4 years ago

OK sir. I will have to get it from E1.0 (plant info sheet) and prepare a dictionary for them.

ambarishK commented 4 years ago

Sir, please check the normalized list of plant parts.

plantParts20191010.tsv

Total count - 135

Still, there are plant parts which appear similar to each other.


fruit (3/4th yellow)
fruit (color turn)
fruit (dark green)
fruit (full yellow)
fruit (half yellow)
fruit (Juice extract)
fruit (light green)
fruit (ripe)
fruit (soaked chopped)
fruit (soaked grounded)
fruit (unripe cut)
fruit (unripe Intact)

petermr commented 4 years ago

On Thu, Oct 10, 2019 at 11:23 AM Ambarish Kumar notifications@github.com wrote:

Sir, please check for the normalized list of plant-parts.

plantParts20191010.tsv https://github.com/petermr/CEVOpen/blob/master/dictionary/plant/raw/plantParts20191010.tsv

Thanks, I will edit this...

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

ambarishK commented 4 years ago

Sir, how can I get the Wikidata ID for plant parts?

Almost all queries return scientific articles.

petermr commented 4 years ago

ami-dictionary will look up words in Wikidata. (Is that not what you are doing anyway for the other dictionaries?)

On Thu, Oct 10, 2019 at 5:39 PM Ambarish Kumar notifications@github.com wrote:

Sir, how to get wikidata id for plant parts?


ambarishK commented 4 years ago

Yes sir.

petermr commented 4 years ago

Have edited plantparts and also created a new plantparts/ dictionary.

ambarishK commented 4 years ago

Sir, please go through the updated plantParts20191010.tsv.

Column description is as follows.


Dictionary for plant parts is plantParts.xml.

ambarishK commented 4 years ago

Sir, please check the plant-part snippets extracted from oil186.

File - plantPartsOil186.csv

Extracted records - 30

In process ....

petermr commented 4 years ago

Thanks Ambarish, this looks like a good start. (I was on another call.)

On Sat, Oct 12, 2019 at 11:16 AM Ambarish Kumar notifications@github.com wrote:

Sir, please check for the extracted plant parts snippet from oil186.

File - plantPartsOil186.csv https://github.com/petermr/CEVOpen/blob/master/dictionary/plantparts/raw/plantPartsOil186.csv

Extracted records - 30

In process ....


petermr commented 4 years ago

@ambarishK thanks for the oil186 mentions. TASK: please extend this to the rest of the 186 entries (later we will differentiate plant parts and processing).

petermr commented 4 years ago

Make sure that the description of the material in the paper is separated into the plant part and the processing: Examples:

In general a qualifier in brackets will be part of the process, not the plantpart.
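The bracket rule can be sketched in a few lines of Python. This is only an illustration: `split_part_and_process` is a hypothetical helper, and it assumes the description has the form `part (qualifier)` with at most one bracketed qualifier at the end, as in entries like `fruit (half yellow)`.

```python
import re

# Matches "part" optionally followed by "(qualifier)"; assumed input shape only.
_PART_WITH_QUALIFIER = re.compile(r"^\s*([^()]+?)\s*(?:\(\s*(.*?)\s*\))?\s*$")


def split_part_and_process(description):
    """Return (plant_part, processing); processing is None when no brackets are present."""
    match = _PART_WITH_QUALIFIER.match(description)
    if match is None:
        # Unexpected shape (e.g. text starting with a bracket): leave it for manual review.
        return description.strip(), None
    part, qualifier = match.group(1), match.group(2)
    return part.lower(), (qualifier or None)


if __name__ == "__main__":
    for text in ["fruit (half yellow)", "fruit (Juice extract)", "stem"]:
        print(text, "->", split_part_and_process(text))
```

The qualifier is kept verbatim rather than normalized, since the thread leaves open how processing terms will be classified later.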

ambarishK commented 4 years ago

Sir, please go through plant parts dictionary file - plantParts20191014.xml

Script to prepare dictionary for plant parts - plantParts20191014.sh

Total records - 20.

petermr commented 4 years ago

Ambarish, you are STILL putting "scientific article" entries into the dictionaries. For example: https://www.wikidata.org/wiki/Q59105157

This is wasting my time, and also corrupting the dictionaries. There must be NO MORE scientific articles.

instance of scholarly article
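One way to keep such items out is to check each candidate's "instance of" (P31) values before it is written to the dictionary. The sketch below is a minimal, offline illustration: it assumes the candidates have already been fetched and reduced to dicts with an `id` and an `instance_of` list (that record shape and the function name are assumptions, not part of ami-dictionary). `Q13442814` is the Wikidata class "scholarly article".

```python
SCHOLARLY_ARTICLE = "Q13442814"  # Wikidata class "scholarly article"


def drop_scholarly_articles(candidates, excluded=frozenset({SCHOLARLY_ARTICLE})):
    """Keep only candidates whose P31 ("instance of") values avoid the excluded classes."""
    return [
        candidate
        for candidate in candidates
        if not excluded.intersection(candidate.get("instance_of", ()))
    ]


if __name__ == "__main__":
    # Hypothetical, pre-fetched records; real P31 values for Q134267 are not reproduced here.
    candidates = [
        {"id": "Q59105157", "instance_of": ["Q13442814"]},
        {"id": "Q134267", "instance_of": []},
    ]
    print([c["id"] for c in drop_scholarly_articles(candidates)])
```

The `excluded` set can be extended with other unwanted classes if further kinds of false hits turn up.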

petermr commented 4 years ago

Ambarish, you MUST check this before sending it to me:

<entry term="bulbs" wikipedia="bulbs" wikidata="Q4996086" name="‎Bulbs‎" description="1974 single by Van Morrison" id="CM.plantParts.6"/>

THIS IS NOT A PLANT-PART

THIS IS NOT A PLANT-PART

THIS IS NOT A PLANT-PART

<entry term="stem" wikipedia="stem" wikidata="Q134267" name="‎plant stem‎" description="one of 2 main structural axes of a vascular plant (together with the root), that supports leaves, flowers and fruits, transports fluids between the roots and the shoots in the xylem and phloem, stores nutrients and produces new living tissue" id="CM.plantParts.20"/>
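A quick automated pass over the dictionary can catch entries like the first one before they are submitted. The following is a rough sketch, not a replacement for manual checking: the wrapping `<dictionary>` element, the keyword list, and the function name are assumptions made for this example, and the stem description is shortened.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic keywords that suggest a non-botanical Wikidata item (assumed list, extend as needed).
SUSPECT = re.compile(
    r"\b(scientific article|scholarly article|single by|album|film|song)\b",
    re.IGNORECASE,
)


def suspicious_entries(xml_text):
    """Return (term, wikidata id, description) for entries whose description looks wrong."""
    root = ET.fromstring(xml_text)
    flagged = []
    for entry in root.iter("entry"):
        description = entry.get("description", "")
        if SUSPECT.search(description):
            flagged.append((entry.get("term"), entry.get("wikidata"), description))
    return flagged


SAMPLE = """<dictionary title="plantParts">
  <entry term="bulbs" wikipedia="bulbs" wikidata="Q4996086" name="Bulbs" description="1974 single by Van Morrison" id="CM.plantParts.6"/>
  <entry term="stem" wikipedia="stem" wikidata="Q134267" name="plant stem" description="one of 2 main structural axes of a vascular plant" id="CM.plantParts.20"/>
</dictionary>"""

if __name__ == "__main__":
    for term, qid, description in suspicious_entries(SAMPLE):
        print(f"CHECK {term} ({qid}): {description}")
```

A keyword check like this only flags obvious mismatches; entries it does not flag still need to be read by a person.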

On Mon, Oct 14, 2019 at 11:10 AM Ambarish Kumar notifications@github.com wrote:

Sir, please go through plant parts dictionary file - plantParts20191014.xml https://github.com/petermr/CEVOpen/blob/master/dictionary/plantparts/raw/plantParts20191014.xml


ambarishK commented 4 years ago

Sir, I have made all the corrections.

Please check the updated files: plantParts20191014.xml and plantParts20191014.tsv