petermr / CEVOpen

Contentmining of Open phytochemical literature for medicinal activities

Add WikidataID refs to articles #25

Open petermr opened 4 years ago

petermr commented 4 years ago

Many open access articles have IDs in Wikidata (i.e. the bibliographic entry itself has an ID). For example, the article titled "Thymus vulgaris essential oil: chemical composition and antimicrobial activity." has the Wikidata ID "Q35340706". If we can find Wikidata IDs for all articles, that would be very useful.

The easiest thing is probably to search Wikidata by PMCID; for the article above, the search should return Q35340706.

Ambarish, please try to retrieve these IDs for oil186 and make a simple table (CSV) with PMCID and Wikidata ID.
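A minimal sketch of how such a lookup could be automated, not the method actually used in this thread. It assumes that Wikidata stores the PMCID under property P932 without the "PMC" prefix and that the public SPARQL endpoint at query.wikidata.org is used; the function names and the output filename are hypothetical.

```python
import csv
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata query service


def build_query(pmcid):
    """Build a SPARQL query for the item whose PMCID (assumed property P932) matches."""
    # Assumption: P932 values are stored without the "PMC" prefix.
    digits = pmcid[3:] if pmcid.upper().startswith("PMC") else pmcid
    if not digits.isdigit():
        raise ValueError("not a PMCID: %r" % pmcid)
    return 'SELECT ?item WHERE { ?item wdt:P932 "%s" . }' % digits


def parse_qids(result):
    """Extract Q-identifiers from a SPARQL JSON result document."""
    bindings = result["results"]["bindings"]
    return [b["item"]["value"].rsplit("/", 1)[-1] for b in bindings]


def lookup(pmcid):
    """Return the first Wikidata ID found for a PMCID, or "" if none is found."""
    params = urllib.parse.urlencode({"query": build_query(pmcid), "format": "json"})
    request = urllib.request.Request(
        SPARQL_ENDPOINT + "?" + params,
        headers={"User-Agent": "pmcid-to-wikidata-sketch/0.1"},  # hypothetical agent name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        qids = parse_qids(json.load(response))
    return qids[0] if qids else ""


def write_table(pairs, path):
    """Write (PMCID, Wikidata ID) pairs as a simple two-column CSV."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["PMCID", "WikidataID"])
        writer.writerows(pairs)
```

Calling `write_table([(p, lookup(p)) for p in pmcids], "pmcid_wikidata.csv")` on a list of PMCIDs would then give the requested table; articles with no match get an empty second column.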

ambarishK commented 4 years ago

Yes sir.

petermr commented 4 years ago

@ambarishK please add WikidataIDs to articleAnalysis table as column 2 (immediately after PMCID).
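One way the new column could be slotted in, as a sketch only: it assumes the table has been read into a list of rows with PMCID in the first column, and that a PMCID-to-Wikidata-ID mapping already exists. The function name and the example values in the usage note are hypothetical; the header text WIKIDATAID follows the column name used in the sheet.

```python
def add_wikidata_column(rows, wikidata_ids, header="WIKIDATAID"):
    """Return a copy of rows with a Wikidata ID column inserted as column 2.

    rows: list of lists; rows[0] is the header and column 1 holds the PMCID.
    wikidata_ids: dict mapping PMCID to Wikidata ID; missing entries become "".
    """
    if not rows:
        return []
    result = [[rows[0][0], header] + list(rows[0][1:])]
    for row in rows[1:]:
        pmcid = row[0]
        result.append([pmcid, wikidata_ids.get(pmcid, "")] + list(row[1:]))
    return result
```

For example, `add_wikidata_column([["PMCID", "Plant"], ["PMC1", "Thymus"]], {"PMC1": "Q1"})` yields a table whose second column is WIKIDATAID, with every other column shifted one place to the right.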

ambarishK commented 4 years ago

Sir, please check the updated sheet for the activity tests for species, now with Wikidata IDs.

petermr commented 4 years ago

Please read the Github message:

We can make this file beautiful and searchable <https://help.github.com/articles/rendering-csv-and-tsv-data> if this error is corrected: Illegal quoting in line 67.
PMCID WIKIDATAID Location Plant LiteratureActivities TargetOrganisms Target species Activity table Activity figure EO com

This means the table is not regular (line 67 is corrupt). Please correct this. If there are more errors, they will show up one by one.
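To find all irregular lines at once rather than one by one, a rough local check along these lines might help. It is a sketch that only approximates what GitHub's renderer complains about, not its actual parser: it flags blank lines, lines with an odd number of double quotes, and lines whose field count differs from the header. The function name is hypothetical.

```python
def find_irregular_lines(text, delimiter="\t"):
    """Return (line_number, reason) pairs for lines that break table regularity."""
    lines = text.splitlines()
    if not lines:
        return []
    expected = lines[0].count(delimiter) + 1  # field count taken from the header row
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.count(delimiter) + 1
        if not line.strip():
            problems.append((number, "blank line"))
        elif line.count('"') % 2 == 1:
            problems.append((number, "unbalanced double quote"))
        elif fields != expected:
            problems.append((number, "%d fields, expected %d" % (fields, expected)))
    return problems
```

Reading the TSV with `open(path, encoding="utf-8").read()` and passing the text to this function lists every suspicious line number in one pass, so the file can be corrected before it is pushed.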

On Fri, Oct 4, 2019 at 11:49 AM Ambarish Kumar notifications@github.com wrote:

Sir, check for the updated sheet https://github.com/petermr/CEVOpen/blob/master/project/articleAnalysis/raw/manualAnalysis186_20191003.tsv for activity test for species with WIKIDATA ID.

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

petermr commented 4 years ago

The WikidataIDs look good, thank you. There are only 2 not found.

But look at the table in the Github repo - there are several irregular lines so something is wrong in those lines. Please correct them.

ambarishK commented 4 years ago

OK sir.

petermr commented 4 years ago

The file manualAnalysis51_20190930.tsv in https://github.com/petermr/CEVOpen/tree/master/project/articleAnalysis/raw is well formatted, but the later ones are not. The TSV files should all look "beautiful" like this one.

ambarishK commented 4 years ago

Sir, I made all corrections to lines of the file - https://github.com/petermr/CEVOpen/blob/master/project/articleAnalysis/raw/manualAnalysis186_20191003.tsv.

Errors were because of quotes and extra line spacing.

Now it looks good.

petermr commented 4 years ago

Thank you. PLEASE GIVE ACTUAL FILENAME in message, not "the file"

ambarishK commented 4 years ago

OK sir. I updated the previous comment.

petermr commented 4 years ago

The file https://github.com/petermr/CEVOpen/blob/master/project/articleAnalysis/raw/manualAnalysis186_20191003.tsv is satisfactory. Thank you. I will scan it for problems.
