gilienv / EssOilDB

Restructuring of Essential Oil Database
Apache License 2.0

Open Access articles #91

Open petermr opened 4 years ago

petermr commented 4 years ago

Open Access can revolutionize EssoilDB.

I ran a search:

getpapers -q '("essential oil")' -n

to get:

getpapers -q '("essential oil")' -n -x
info: Searching using eupmc API
info: Running in no-execute mode, so nothing will be downloaded
info: Found 6254 open access results

This is a HUGE advance (currently we only have 10!).

The advantages are:

So in V2.0 I propose we concentrate on importing new papers. It is likely that much of the import can be automated.

We should use the 10 papers in V2.0 to help us decide what fields should be extracted and what structure we use for the extraction.

For comparison, there are many more results when closed access articles are included (the -a option searches all papers, not just open access):

$ getpapers -q '("essential oil")' -n -a
info: Searching using eupmc API
info: Running in no-execute mode, so nothing will be downloaded
info: Found 23576 results
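
The two counts above can also be checked without downloading anything by asking the Europe PMC REST search service for its hit count. The following is only a sketch, not what getpapers does internally: the helper names build_search_url and hit_count are made up for illustration, and the counts returned today will differ from the 2019 figures quoted in this thread.

```python
import json
import urllib.parse
import urllib.request

# Public Europe PMC REST search endpoint.
EPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query: str, open_access_only: bool = True) -> str:
    """Build a search URL that requests a single, minimal result.

    Hypothetical helper: only the total hit count is of interest here,
    so the page size is kept at 1 and only identifiers are requested.
    """
    if open_access_only:
        # OPEN_ACCESS:y is the Europe PMC query-syntax filter for open access.
        query = f"({query}) AND OPEN_ACCESS:y"
    params = {
        "query": query,
        "format": "json",
        "resultType": "idlist",
        "pageSize": 1,
    }
    return EPMC_SEARCH + "?" + urllib.parse.urlencode(params)


def hit_count(url: str) -> int:
    """Fetch the URL and return the hitCount field (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return int(json.load(response)["hitCount"])
```

Calling hit_count(build_search_url('"essential oil"')) gives the open access count, and passing open_access_only=False gives the total for comparison.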

petermr commented 4 years ago

I tried to find all Open Access articles in EPMC. It was messy - see

EssoilDB/tables/bibliography/open

which contains scripts to (a) query EPMC for metadata for all entries (b) search metadata for <isOpenAccess>Y</isOpenAccess>
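
Step (b) can be sketched in a few lines of Python. This is not the script in the repository, just an illustration of the idea. It assumes the metadata is XML in which each entry is a result element with pmcid and isOpenAccess children; only the isOpenAccess element is confirmed by this thread, and the sample IDs are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import List

# Placeholder metadata: the element layout is an assumption and the
# PMC IDs are invented for the example.
SAMPLE = """<responseWrapper><resultList>
  <result><pmcid>PMC0000001</pmcid><isOpenAccess>Y</isOpenAccess></result>
  <result><pmcid>PMC0000002</pmcid><isOpenAccess>N</isOpenAccess></result>
  <result><isOpenAccess>Y</isOpenAccess></result>
</resultList></responseWrapper>"""


def open_access_ids(metadata_xml: str) -> List[str]:
    """Return the pmcid of every result flagged <isOpenAccess>Y</isOpenAccess>."""
    root = ET.fromstring(metadata_xml)
    ids = []
    for result in root.iter("result"):
        flag = (result.findtext("isOpenAccess") or "").strip()
        pmcid = (result.findtext("pmcid") or "").strip()
        # Skip entries that are not open access or have no PMC ID to fetch.
        if flag == "Y" and pmcid:
            ids.append(pmcid)
    return ids


print(open_access_ids(SAMPLE))
```

Entries without a PMC ID are dropped because there would be no full text to download for them.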

Ultimately 10 entries were retrieved; 2 appeared to have corrupt links and 2 papers were not about terpenes, so we end with 6 in CProject open.

This was converted and analyzed with AMI:

MacBook-Pro-3:open pm286$ ami-search-new -p open/ --dictionary species country funders monoterpene

Generic values (AMISearchTool)
================================
cproject            /Users/pm286/workspace/projects/EssOilDB/tables/bibliography/epmc/open/open
ctree               
cTreeList           6 trees [open/PMC2900066, open/PMC3963878, open/PMC4044670

Specific values (AMISearchTool)
================================
dictionaryList       [species, country, funders, monoterpene]
dictionarySuffix     [xml]
cProject: open

cmd> word(frequencies)xpath:@count>20~w.stopwords:pmcstop.txt_stopwords.txt
cmd> species(binomial)
cmd> search(country)
cmd> search(funders)
cmd> search(monoterpene)

0    [main] DEBUG org.contentmine.ami.plugins.CommandProcessor  - running NORMA -i fulltext.xml -o scholarly.html --transform nlm2html --project open
PMC2900066 .PMC3963878 PMC4044670 PMC6253779 PMC6259589 PMC6268859 
running: word; word([frequencies])[{xpath:@count>20}, {w.stopwords:pmcstop.txt stopwords.txt}]PMC2900066 .PMC3963878 PMC4044670 PMC6253779 PMC6259589 PMC6268859 ...
running: species; species([binomial])[]SP: open...
running: search; search([country])[]...
running: search; search([funders])[]...
running: search; search([monoterpene])[]...
create data tables
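
For readers unfamiliar with the word(frequencies) step in the log above, the general idea (count words, drop stopwords, keep only those with a count above 20) can be sketched as follows. This is not AMI's implementation; the function name and the simple tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def frequent_words(text: str, stopwords: Iterable[str], min_count: int = 20) -> Dict[str, int]:
    """Count words, ignore stopwords, keep those occurring more than min_count times."""
    stop = {w.lower() for w in stopwords}
    # Very simple tokenizer: lower-case runs of ASCII letters only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop)
    # Mirrors the "count > 20" threshold in the command; 20 itself is excluded.
    return {w: n for w, n in counts.items() if n > min_count}
```

For example, a text in which "oil" occurs 21 times and "limonene" 20 times yields only "oil", since the threshold is strict.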

ambarishK commented 4 years ago

Sir,

"This is a HUGE advance (currently we only have 10!)."

Please explain the situation a little in the context of the above figure.

petermr commented 4 years ago

Only 10 of the articles in E1.0 are "Open Access", i.e. you can read them without a subscription. Most of the entries are from J.Essential.Oil.Res. which can only be read by subscribers. I assume NIPGR is a subscriber but I am not.

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

EmanuelFaria commented 4 years ago

Attachment available until Sep 7, 2019

Hi Team!!

If paywalled articles are what you’re looking for…

I made this ePub https://www.dropbox.com/s/l9i4es3rgup0x30/My%20Super-Secret%20Science%20Stash.epub?dl=0 (attached below too) just for you!

It’s got all my best links, tools and extensions to easily get EVERYTHING you need — and MORE!.

Get Your Geek On!!! 🤓

🤜💥🤛 Manny

Emanuel Faria, Founder | Formulator | President, VERRICLEAR NATURAL SKIN ESSENTIALS LTD. Nature + Science = Success!™
North America: http://www.verriclear.com/ South America: http://www.verriclear.com.br/

ambarishK commented 4 years ago

Thank you so much, sir. I find it a really good collection!!

EmanuelFaria commented 4 years ago

You are very welcome, Ambarish! I’m always happy to help! :)

Manny

petermr commented 4 years ago

On Thu, Aug 8, 2019 at 8:26 AM Manny wrote:

If paywalled articles are what you’re looking for… I made this ePub (attached below too) just for you!

Manny,

I don't know what is in this, but don't post paywalled articles to GitHub. Also remember that everything in this thread is visible. Let's talk. If I had your email we could communicate that way, less visibly. I am peter DOT murray DOT rust on gmail.
