BlankerL / DXY-COVID-19-Data

2019新型冠状病毒疫情时间序列数据仓库 | COVID-19/2019-nCoV Infection Time Series Data Warehouse
https://lab.isaaclin.cn/nCoV/
MIT License

Timeline data #51

Closed feralheart closed 4 years ago

feralheart commented 4 years ago

Can you add a data timeline API to check the trend of the virus?

BlankerL commented 4 years ago

Hello, I can hardly understand what you mean by the "data timeline API"... I suppose you need time-series data? You can actually get the time-series data by passing latest=0 when calling the API.

If you just need to visualize the spread of the virus, you can actually find it here.
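A minimal sketch of how the latest=0 call mentioned above could be consumed. The endpoint URL is the project's API root from the README; the helper and the tiny inline payload are illustrative (real responses contain many thousands of records):

```python
# Sketch: with latest=0 the API returns every historical snapshot per region
# instead of only the newest one.  The {"results": [...]} envelope and the
# provinceName/updateTime field names come from the project's data format.
API_URL = "https://lab.isaaclin.cn/nCoV/api/area?latest=0"  # assumed route

def snapshots_by_province(payload):
    """Group time-series records by province, sorted by updateTime."""
    grouped = {}
    for record in payload.get("results", []):
        grouped.setdefault(record["provinceName"], []).append(record)
    for records in grouped.values():
        records.sort(key=lambda r: r["updateTime"])
    return grouped

# Example with a tiny inline payload:
sample = {"results": [
    {"provinceName": "Hubei", "updateTime": 2, "confirmedCount": 6},
    {"provinceName": "Hubei", "updateTime": 1, "confirmedCount": 5},
]}
series = snapshots_by_province(sample)
print([r["confirmedCount"] for r in series["Hubei"]])  # → [5, 6]
```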

feralheart commented 4 years ago

Thank you, I only saw the JSON files at first. But this API is exactly what I was looking for.

BlankerL commented 4 years ago

> Thank you, I only saw the JSON files at first. But this API is exactly what I was looking for.

Thank you for your support. I will also commit the time-series JSON to this project in the future. You may want to keep an eye on the project and switch to the JSON hosted on GitHub to cut down traffic to the backend.

feralheart commented 4 years ago

I am reopening this issue. Today I wanted to continue developing my app and got this error: [screenshot from 2020-03-19 11-09-00]

BlankerL commented 4 years ago

You can check the JSON time-series data here. For example, JSON for DXYArea: https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/json/DXYArea-TimeSeries.json
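A hedged sketch of fetching that raw GitHub file directly instead of hitting the API backend. The URL is the one quoted above; the helper name is illustrative:

```python
import json
import urllib.request

# Fetch the time-series JSON committed to the repository.  This avoids
# loading the project's API server; the raw URL is taken from the thread.
RAW_URL = ("https://raw.githubusercontent.com/BlankerL/"
           "DXY-COVID-19-Data/master/json/DXYArea-TimeSeries.json")

def load_timeseries(url=RAW_URL):
    """Download and parse the full time-series JSON (this file is large)."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.load(resp)

# records = load_timeseries()  # one dict per region snapshot
```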

davpirelli commented 4 years ago

> You can check the JSON time-series data here. For example, JSON for DXYArea: https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/json/DXYArea-TimeSeries.json

This seems too big to parse, do you have a solution for that?

BlankerL commented 4 years ago

> You can check the JSON time-series data here. For example, JSON for DXYArea: https://raw.githubusercontent.com/BlankerL/DXY-COVID-19-Data/master/json/DXYArea-TimeSeries.json
>
> This seems too big to parse, do you have a solution for that?

That is because it contains more than 45k documents from the database. If you just want to do research, you can simply use the DXYArea.csv file; the JSON time-series file is meant for some people's visualization purposes.
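For research use of the CSV mentioned above, a small stdlib-only sketch. The csv/ path in the raw URL is an assumption based on the repository layout; the column names appear in the actual data:

```python
import csv
import io

# Work with the flat CSV export instead of the large time-series JSON.
CSV_URL = ("https://raw.githubusercontent.com/BlankerL/"
           "DXY-COVID-19-Data/master/csv/DXYArea.csv")  # assumed path

def latest_confirmed_per_province(csv_text):
    """Return the most recent cumulative confirmed count per province."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, when = row["provinceName"], row["updateTime"]
        # ISO-style timestamps compare correctly as strings.
        if name not in latest or when > latest[name][0]:
            latest[name] = (when, int(row["province_confirmedCount"]))
    return {name: count for name, (when, count) in latest.items()}

sample = (
    "provinceName,province_confirmedCount,updateTime\n"
    "Hubei,5,2020-02-01 14:00:00\n"
    "Hubei,6,2020-02-01 15:00:00\n"
)
print(latest_confirmed_per_province(sample))  # → {'Hubei': 6}
```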

feralheart commented 4 years ago

Sorry, but this JSON is too big. Is there any chance of restoring the API?

BlankerL commented 4 years ago

> Sorry, but this JSON is too big. Is there any chance of restoring the API?

Maybe the last thing I can do is commit the data for different provinces separately.

As you can see, the data is too large for the API to send to every user calling it. The server only has 10 Mbps of bandwidth and 1 vCPU. Whenever all 4 gunicorn workers are occupied by latest=0 requests (time-series data transfers), the server just freezes.

Your multi-core personal computer can hardly parse the data, and the 1-vCPU server needs to compile the time-series data more than 1k times a day.

Would committing the province data separately help you?

feralheart commented 4 years ago

Can you make a JSON per continent?

BlankerL commented 4 years ago

> Can you make a JSON per continent?

It might be a good idea. However, the most serious problem is that the MongoDB on this server cannot even respond to db_dump(); calling db_dump() immediately pins MongoDB at 100% CPU...

I will try to find a solution.

omar3000 commented 4 years ago

[screenshot]

JSON file is empty.

BlankerL commented 4 years ago

> JSON file is empty.

Fixed.

ludovic5971 commented 4 years ago

Hi,

I don't understand whether the field province_confirmedCount is cumulative or the number of infections per day. When I sum the province_confirmedCount field by updateTime, I get inconsistent values.

BlankerL commented 4 years ago

> Hi,
>
> I don't understand whether the field province_confirmedCount is cumulative or the number of infections per day. When I sum the province_confirmedCount field by updateTime, I get inconsistent values.

It is a cumulative value.

For example, if province_confirmedCount in some place is 5 at 2 p.m. and changes to 6 at 3 p.m., that means one new infection was confirmed and announced.
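The cumulative counts described above can be turned into new-cases-per-interval by differencing consecutive snapshots, as in this minimal sketch:

```python
def new_cases(cumulative):
    """Differences of a cumulative series, e.g. [5, 6, 9] → [5, 1, 3].

    Pairs each value with its predecessor (0 before the first snapshot)
    and subtracts, so each entry is the newly announced count.
    """
    return [b - a for a, b in zip([0] + cumulative[:-1], cumulative)]

print(new_cases([5, 6, 9]))  # → [5, 1, 3]
```

Summing province_confirmedCount across snapshots double-counts, which explains the inconsistent totals; differencing like this, or taking only the latest snapshot per interval, gives consistent values.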

feralheart commented 4 years ago

The JSON is still too big

BlankerL commented 4 years ago

> The JSON is still too big

And getting bigger and bigger.

feralheart commented 4 years ago

Then sorry, but I have to look for another data provider.

BlankerL commented 4 years ago

> Then sorry, but I have to look for another data provider.

Sure. If you need time-series data for a specific country/province, I can only recommend writing a script that downloads the GitHub JSON file every hour and stores the content in your back-end. Every time there is a request to your site, send the cached content to the front-end users.

If you rely only on this API to handle all such requests, other users will hardly be able to fetch data (sending time-series data to countries other than China is very slow, and once all 4 threads are occupied, others will not be able to call the API), which is the main reason for shutting down the time-series response.
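The hourly download-and-cache pattern suggested above could look like this sketch. The raw URL is the one quoted in the thread; the class name, TTL, and injectable fetch hook are illustrative:

```python
import json
import time
import urllib.request

RAW_URL = ("https://raw.githubusercontent.com/BlankerL/"
           "DXY-COVID-19-Data/master/json/DXYArea-TimeSeries.json")

class HourlyCache:
    """Download the GitHub JSON at most once per `ttl` seconds and serve
    the in-memory copy to front-end users in between."""

    def __init__(self, url=RAW_URL, ttl=3600, fetch=None):
        self.url, self.ttl = url, ttl
        self.fetch = fetch or self._download  # injectable for testing
        self._data, self._stamp = None, 0.0

    def _download(self):
        with urllib.request.urlopen(self.url, timeout=60) as resp:
            return json.load(resp)

    def get(self):
        # Refresh only when the cached copy is missing or stale.
        if self._data is None or time.time() - self._stamp > self.ttl:
            self._data = self.fetch()
            self._stamp = time.time()
        return self._data

# cache = HourlyCache()
# data = cache.get()  # hits GitHub at most once per hour
```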