Open justaddcoffee opened 4 years ago
Sounds like @realmarcin will implement this, and Bill can review his mapping/transform
Added a documentation directory for the C3.ai API.
We can place other documentation there as needed.
cc @deepakunni3 @realmarcin
Note that if this ingest requires an API call, this ticket will probably also require a function in download() to call the API and emit a file.
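A minimal sketch of what such a helper could look like, assuming the API accepts a JSON POST body and returns JSON. The names `post_json` and `download_api_to_file` are hypothetical (they are not existing kg-covid-19 functions), and the endpoint URL and payload would have to come from the C3.ai documentation:

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable, Union


def post_json(url: str, payload: dict, timeout: float = 30.0) -> bytes:
    """POST a JSON payload and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def download_api_to_file(
    url: str,
    payload: dict,
    outfile: Union[str, Path],
    fetch: Callable[[str, dict], bytes] = post_json,
) -> Path:
    """Call the API once and write the response to `outfile`.

    `fetch` is injectable so the function can be exercised offline.
    """
    out = Path(outfile)
    out.parent.mkdir(parents=True, exist_ok=True)  # create the output dir if needed
    out.write_bytes(fetch(url, payload))
    return out
```

Keeping the network call behind `fetch` means the file-emitting step can be tested without hitting the API.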
See quickstart.ipynb for examples of how to access data.
Discussion with Bill - initial thought is to ingest the following for each case:
age
sex
location
symptoms
therapeutic/clinical intervention
outcome
covid genome sequence
Per Marcin's sharp eyes, we could just ingest clinical data upstream, from where C3.ai ingests it: https://raw.githubusercontent.com/beoutbreakprepared/nCoV2019/master/latest_data/latestdata.csv
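A rough sketch of pulling the per-case fields listed above out of that CSV. The column names in `WANTED_COLUMNS` are assumptions and should be checked against the actual header of latestdata.csv; intervention and genome sequence are left out because it is not clear that this file carries them. `select_case_fields` is a hypothetical helper name:

```python
import csv
from typing import Dict, Iterator, Sequence, TextIO

# Assumed column names; verify against the header of latestdata.csv.
WANTED_COLUMNS = ("ID", "age", "sex", "country", "province", "city", "symptoms", "outcome")


def select_case_fields(
    handle: TextIO, wanted: Sequence[str] = WANTED_COLUMNS
) -> Iterator[Dict[str, str]]:
    """Yield one dict per case, keeping only the wanted columns.

    Columns missing from the file (or from a short row) come back as "".
    """
    for row in csv.DictReader(handle):
        yield {col: (row.get(col) or "").strip() for col in wanted}
```

Usage would be along the lines of `with open("latestdata.csv", newline="") as f: cases = list(select_case_fields(f))`, after the file has been downloaded.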
@justaddcoffee @realmarcin
Where did you find these URLs for the data? I can't find them.
These two links seem to be the upstream source of some of C3.ai data - are they not working for you?
https://raw.githubusercontent.com/beoutbreakprepared/nCoV2019/master/latest_data/latestdata.csv
https://docs.google.com/spreadsheets/d/e/2PACX-1vQU0SIALScXx8VXDX7yKNKWWPKE1YjFlWc6VTEVSN45CklWWf-uWmprQIyLtoPDA18tX9cFDr-aQ9S6/pubhtml
Yes. They work for me.
I was curious how you found them in the documentation.
Name of the dataset ~Clinical data from C3.ai data lake, for example case (age/gender/location/symptoms/date of onset) from https://c3.ai/covid-19-api-documentation/~
Per Marcin's sharp eyes, we could just ingest clinical data upstream, from where C3.ai ingests it: https://github.com/Knowledge-Graph-Hub/kg-covid-19/issues/102
Also here: https://docs.google.com/spreadsheets/d/e/2PACX-1vQU0SIALScXx8VXDX7yKNKWWPKE1YjFlWc6VTEVSN45CklWWf-uWmprQIyLtoPDA18tX9cFDr-aQ9S6/pubhtml
Mapping or relevant fields TBD
If possible, highlight which fields map to nodes and which fields map to edges. Refer to Data Preparation for guidelines on how the final transformed data should be represented.
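Since the mapping is still TBD, the following is only an illustration of one way case fields could split into nodes and edges: the case and each symptom become nodes, age and sex become properties on the case node, and each case-symptom pair becomes an edge. The `CASE:` and `SYMPTOM:` ID prefixes, the `;` symptom separator, the column names, and the choice of Biolink category and predicate strings are all assumptions for the sketch, not a decided mapping:

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def case_to_graph(case: Row) -> Tuple[List[Row], List[Row]]:
    """Turn one case record into (node rows, edge rows).

    IDs, categories, and the predicate are placeholders for illustration.
    """
    case_id = f"CASE:{case['ID']}"
    # Scalar fields (age, sex) are kept as properties on the case node.
    nodes: List[Row] = [{
        "id": case_id,
        "category": "biolink:Case",
        "age": case.get("age", ""),
        "sex": case.get("sex", ""),
    }]
    edges: List[Row] = []
    # Assumes symptoms are ';'-separated free text.
    for raw in case.get("symptoms", "").split(";"):
        symptom = raw.strip().lower()
        if not symptom:
            continue
        symptom_id = "SYMPTOM:" + symptom.replace(" ", "_")
        nodes.append({
            "id": symptom_id,
            "name": symptom,
            "category": "biolink:PhenotypicFeature",
        })
        edges.append({
            "subject": case_id,
            "predicate": "biolink:has_phenotype",
            "object": symptom_id,
        })
    return nodes, edges
```

A real transform would presumably map symptom strings to ontology terms rather than minting IDs from free text, and would deduplicate symptom nodes across cases.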