JaniceSHLee / chnslab_projectCOVID

Allow all members in the group to use Github and apply tips from Open Data Science course.

covid data #2

Open speedysonic opened 4 years ago

speedysonic commented 4 years ago

Hi Estya, Yuti:

For a start, I think we need to find the data. Is there a centralized database out there, or do we need to pull information from separate sources?

Let's find out what data is out there and get back to each other by next Monday, 27 July.

speedysonic commented 4 years ago

https://ourworldindata.org/covid-cases

nurestya commented 4 years ago

Hi Kai Wan and Yuti,

Thank you both, and sorry, I only realised this comment function existed today. Yuti did share the covid19.analytics package in the CHNS lab chat group on WhatsApp: https://cran.r-project.org/web/packages/covid19.analytics/vignettes/covid19.analytics.html

For a start, I think there are several datasets and functions in the package we can choose from, such as:

report.summary (geo.loc="Indonesia")

covid19.data

tots.per.location (covid19.confirmed.cases,geo.loc="Indonesia")

growth.rate (TS.data,geo.loc="Indonesia")

I am exploring these data as we speak and am also exploring the link you just sent. Will update on what I find later! I will be slightly late as I gotta collect a camera stand in Outram today for our flammability experiments.

See you all soon!

speedysonic commented 4 years ago

I think the next thing we need to do is find out from the rest what spatial scale we should be using. Are we covering a few countries, a specific geographic region (e.g. Southeast Asia) or the whole world? Also, for the covid data, should we use multiple datasets and compare them, or stick to one database? Settling these questions will help us divide the work.

nurestya commented 4 years ago

Ok, sounds good,

We can discuss this later on during the meeting at 4pm!

Thanks again @speedysonic!

speedysonic commented 4 years ago

Looking at the other issue thread, they seem to want to work at a country level, at a monthly scale. Are we covering every country or just a few selected countries? Maybe we should try a few countries first. Let's discuss with the rest later.

nurestya commented 4 years ago

We can do that and see what we can find. Perhaps Southeast Asia and/or Indochina?

By the way, I will be late later as I need to collect a camera stand from Outram, but I will try to get to the meeting as soon as I can.

Yuti-AI commented 4 years ago

Sounds good. Another option that aggregates COVID data is Worldometer: https://www.worldometers.info/coronavirus/about/ although they also refer to the Johns Hopkins database (among others). I think the scale depends on the availability of the deforestation data.

nurestya commented 4 years ago

Ah ok, thank you for the link Yuti! Noted on the scale we intend to look at, and yes, we can check with Lubis and Anushka on the availability of deforestation data during the meeting. Just to share a list of datasets we can tap into as well:

Just a few to play around with as a head start! Hahahah!

nurestya commented 4 years ago

IBM - Weather Data app and COVID-19 incidence (link is in the ZDNet article)

Yuti-AI commented 4 years ago

I also found this one on GitHub: https://github.com/joachim-gassen/tidycovid19. He compared different covid data sources and made a new package to do that. One of the sources is the one Kai Wan linked, Our World in Data. I have the tendency to gravitate towards existing packages :D

nurestya commented 4 years ago

Hahahah, no worries, me too Yuti! Hahahahahahah! Thanks Yuti!

nurestya commented 4 years ago

Hi both,

I will take a look at the Johns Hopkins data in detail tomorrow and see which datasets we can use and align with the deforestation and fire datasets. Will get back to you both by the end of tomorrow! Update soon! =DDD

speedysonic commented 4 years ago

I don't really know how to use R (sorry, I'm weak at it), so please let me know what I can do next.

nurestya commented 4 years ago

Yoo @speedysonic and Yuti, no worries about it! We are all learning too!

Currently I am using RStudio - RMarkdown.

I was playing around with the covid19.analytics data yesterday. So far I have managed to extract the dataset, but it still covers every country.region and all the dates between January and July. I am still working it out, but the next steps I have in mind are:

(1) To collate the confirmed cases from the following countries: 'Brunei','Cambodia','China','Indonesia', 'Japan','Korea,South','Malaysia','Singapore','Taiwan','Vietnam','Laos','Burma','India', 'Philippines','Thailand'

(2) To merge the daily data into one column per month. Basically we have to code a function in R that does the calculation for us, and at the same time rename each column to a new assigned label, e.g. January, February, etc.

You can see my working in the attached screenshots (jpeg). Let me know if you cannot open them or don't understand. I tried to make it as tidy and clear as possible hahahahah, but I can't knit/export the script to html/pdf as the dataset is HUGE! So I am attaching screenshots in jpeg format here for now.

If you have better ideas or suggestions, please share them! I am open to trying things out too!

Attached screenshots: chsncovid19script01, chsncovid19script02, chsncovid19script03
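The two steps described above (keep only the selected countries, then collapse daily values into months) could be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the script in the screenshots: it assumes a wide table with one row per location, one column per day named YYYY-MM-DD, and cumulative confirmed cases as values. The toy table `ts_wide`, its numbers, and the two-country subset are made up for the example.

```r
# Toy stand-in for the wide time-series table (assumed layout, made-up numbers):
# one row per location, one column per day, values are cumulative cases.
ts_wide <- data.frame(
  Country.Region = c("Indonesia", "Singapore", "France"),
  `2020-01-30` = c(0, 10, 5),
  `2020-01-31` = c(0, 13, 6),
  `2020-02-01` = c(0, 16, 6),
  `2020-02-29` = c(0, 98, 57),
  `2020-03-01` = c(0, 102, 100),
  `2020-03-31` = c(1528, 926, 52128),
  check.names = FALSE
)

countries <- c("Indonesia", "Singapore")  # hypothetical subset of the country list

# Step 1: keep only the countries of interest.
sel <- ts_wide[ts_wide$Country.Region %in% countries, ]

# Reshape to long format: one row per location per day.
date_cols <- setdiff(names(sel), "Country.Region")
long <- data.frame(
  country    = rep(sel$Country.Region, times = length(date_cols)),
  date       = as.Date(rep(date_cols, each = nrow(sel))),
  cumulative = as.vector(as.matrix(sel[, date_cols]))
)

# If a country is split over several rows (e.g. provinces), sum them per day.
long <- aggregate(cumulative ~ country + date, data = long, FUN = sum)

# Step 2: label each day with its month and take the month-end cumulative total
# (the maximum within the month, since cumulative counts should not decrease).
long$month <- format(long$date, "%Y-%m")
month_end <- aggregate(cumulative ~ country + month, data = long, FUN = max)
month_end <- month_end[order(month_end$country, month_end$month), ]

# New cases per month = this month-end total minus the previous month-end total.
month_end$new_cases <- ave(month_end$cumulative, month_end$country,
                           FUN = function(x) c(x[1], diff(x)))

# Locale-independent month labels (January, February, ...).
month_end$month_name <- month.name[as.integer(substr(month_end$month, 6, 7))]

print(month_end)
```

Taking the month-end maximum rather than summing the daily columns matters here: if the values are cumulative, summing them would count the same cases many times.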

speedysonic commented 4 years ago

Eh, is there anything I can do in Excel at the moment? I can only start learning the basics of R in September because I have to study for an exam for my part-time course (ending at the end of the month).

Yuti-AI commented 4 years ago

@speedysonic I used the same code as Estya and uploaded a CSV version of the data, so you can take a look at it and manipulate it in Excel. I tried to group the data by month but have not managed to do it yet.
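One possible way to do the month grouping while keeping the table in an Excel-friendly wide shape is to pick the last available date column in each month. The sketch below is hedged: it assumes the CSV has a `Country.Region` column plus date columns named YYYY-MM-DD holding cumulative counts, and it uses a small made-up table (`cases`) and a hypothetical output file name instead of the uploaded file.

```r
# Made-up wide table, shaped as the CSV might look after
# read.csv(<file>, check.names = FALSE). Values are illustrative only.
cases <- data.frame(
  Country.Region = c("Malaysia", "Thailand"),
  `2020-01-31` = c(8, 19),
  `2020-02-15` = c(22, 34),
  `2020-02-29` = c(25, 42),
  `2020-03-31` = c(2766, 1651),
  check.names = FALSE
)

date_cols <- setdiff(names(cases), "Country.Region")

# Group the date columns by their "YYYY-MM" prefix.
by_month <- split(date_cols, substr(date_cols, 1, 7))

# For each month keep the latest date column (ISO dates sort as text),
# i.e. the cumulative total at the end of that month.
last_col <- vapply(by_month, function(cols) max(cols), character(1))

monthly <- cases[, c("Country.Region", last_col)]

# Rename the kept columns to month names (January, February, ...).
names(monthly) <- c("Country.Region",
                    month.name[as.integer(substr(names(last_col), 6, 7))])

# Write a small CSV that opens directly in Excel (hypothetical file name).
out_file <- file.path(tempdir(), "monthly_cases.csv")
write.csv(monthly, out_file, row.names = FALSE)
```

Plain month names only work while the data stays within one calendar year; for longer series the "YYYY-MM" labels would be safer as column names.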

@nurestya thanks Estya :)