petermr / CEVOpen

Contentmining of Open phytochemical literature for medicinal activities

Naming files and consistency #26

Closed petermr closed 4 years ago

petermr commented 4 years ago

Naming files and documenting their contents is critical.

Please create short, meaningful filenames. Please DON'T create long filenames containing spaces or punctuation, such as:

Manny's Activity Table RAW for Ambarish 2019-10-02.tsv

This name contains spaces (which cause HUGE problems), punctuation (even worse) and personal names which mean nothing to newcomers.

This file could be named:

.../raw/activityClassification20191001.tsv

Variants should simply have a different date:

/raw/activityClassification20191003.tsv

so we can tell it's a derivative of the previous ones.
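As a rough illustration of the naming rule above, here is a minimal Python sketch. The pattern (a letters-and-digits stem followed by an 8-digit date and an extension) and the helper names `is_safe_name` and `dated_name` are assumptions made for this example, not an agreed project convention.

```python
import re
from datetime import date

# Assumed convention (hypothetical): alphanumeric stem, then YYYYMMDD, then ".ext".
# No spaces, no punctuation other than the single dot before the extension.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*\d{8}\.[a-z0-9]+")


def is_safe_name(filename: str) -> bool:
    """Return True if the filename follows the assumed stem+date pattern."""
    return SAFE_NAME.fullmatch(filename) is not None


def dated_name(stem: str, day: date, ext: str = "tsv") -> str:
    """Build a filename such as activityClassification20191001.tsv."""
    return f"{stem}{day:%Y%m%d}.{ext}"


if __name__ == "__main__":
    print(is_safe_name("activityClassification20191001.tsv"))
    print(is_safe_name("Manny's Activity Table RAW for Ambarish 2019-10-02.tsv"))
    print(dated_name("activityClassification", date(2019, 10, 3)))
```

A check like this could be run over a directory listing before committing, so that names with spaces or personal names are caught early.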

content

A TSV or CML file should ONLY have consistent column data:

Essoil DB Correction needed Multi CAIDs caid    caname  caName and Synonyms ActivityClass   (Phytomedicinal Activity) ActionType    Activity Description

These are the column headings, and only data of each type should be held in any column.

 (AMBRISH: Please try to extract the activity descriptions into a new column/field. These need to be verified somehow)  Specific ActivityTarget "Reported Uses (Diseases, Symptoms, etc.)"  MannyNotes_Compound Activity    "Route of use: Topical
Oral
Uncertain
Both"

This should be in a metadata file (README.md), which should also be updated whenever new data or columns are added. (This is not an easy area; XML is better suited.)

Please do NOT include "Manny", "Ambarish", "New", etc.
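One part of the column-consistency rule can be checked mechanically: every record should have the same number of fields as the header. The sketch below uses Python's standard `csv` module; the function name `inconsistent_records` and the sample values are made up for illustration, and the check says nothing about whether each column holds the right type of data.

```python
import csv
import io


def inconsistent_records(tsv_text: str) -> list:
    """Return record numbers (header = 1) whose field count differs from the header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, None)
    if header is None:  # empty input: nothing to check
        return []
    return [n for n, row in enumerate(reader, start=2) if len(row) != len(header)]


if __name__ == "__main__":
    # Placeholder data; only the column names come from the table discussed here.
    sample = "caid\tcaname\tActivityClass\n1\tx\ty\n2\tz\n"
    print(inconsistent_records(sample))
```

Free-text notes and instructions placed in header or data cells tend to show up as mismatches here, which is one more reason to move them to README.md.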

Also document versions in README.md files.

EmanuelFaria commented 4 years ago

Got it. Thanks, Peter.

EmanuelFaria commented 4 years ago

FYI: I created a file called README.md, but the formatting tools were not available. I don't know what I've done wrong.

petermr commented 4 years ago

On GitHub this is a Markdown file, so the name is

README.md


-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK