BlankerL / DXY-COVID-19-Crawler

COVID-19/2019-nCoV Realtime Infection Crawler and API
https://lab.isaaclin.cn/nCoV/
MIT License
1.99k stars 400 forks

Notification of Server Maintenance #82

Closed BlankerL closed 4 years ago

BlankerL commented 4 years ago

Thank you so much for your support. As the epidemic overseas has grown more severe, the number of API calls has risen sharply.

This project is deployed at my own expense on a single-core server with 2GB of memory and 10Mbps of bandwidth. The same machine runs the API backend (4 gunicorn threads), MongoDB, the DXY crawler, and the Data Warehouse Updating Automator, so its capacity for requests is very limited. At present it responds to more than 500,000 API calls per day, and outbound traffic exceeds 20GB per day. I am afraid this seriously exceeds the load capacity of the server.
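
As a rough sanity check on those figures, the sketch below converts the daily totals into per-second averages. It assumes the load is spread evenly over 24 hours and that 20GB means 20 × 10^9 bytes; neither is stated in this notice, and real traffic is probably burstier than the mean suggests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load_summary(calls_per_day, bytes_out_per_day, link_mbps):
    """Convert daily totals into average per-second figures."""
    avg_requests_per_second = calls_per_day / SECONDS_PER_DAY
    avg_outbound_mbps = bytes_out_per_day * 8 / SECONDS_PER_DAY / 1_000_000
    avg_response_kb = bytes_out_per_day / calls_per_day / 1_000
    return {
        "avg_requests_per_second": avg_requests_per_second,
        "avg_outbound_mbps": avg_outbound_mbps,
        "avg_response_kb": avg_response_kb,
        # Fraction of the link used on average (assumes decimal megabits).
        "avg_link_utilisation": avg_outbound_mbps / link_mbps,
    }


# Figures from the notice: 500,000 calls/day, 20GB/day outbound, 10Mbps link.
summary = daily_load_summary(500_000, 20 * 10**9, 10)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions the averages come to roughly 5.8 requests per second, 1.85 Mbps outbound, and 40 KB per response. Since the stated figures are lower bounds, the actual values are at least this high.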

In addition, many websites call this API directly from their front end, and some of them have very large traffic. The Referer field in the Nginx log shows that the front ends of some of these sites call the API more than 3-5 times per second, which puts a lot of pressure on the server.
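
For anyone who wants to run the same kind of check on their own server, here is a minimal sketch that finds the busiest second for each Referer. It assumes Nginx's default "combined" log format; the sample lines, addresses, and `/api/example` path are made up for illustration and are not taken from this server's logs.

```python
import re
from collections import Counter

# Matches the middle of Nginx's default "combined" log format:
# ... [time_local] "request" status body_bytes_sent "referer" "user_agent"
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)"'
)


def peak_requests_per_second(lines):
    """Return {referer: highest number of requests seen in any one second}."""
    per_second = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        # time_local has one-second resolution, so it works as a bucket key.
        per_second[(match.group("referer"), match.group("time"))] += 1
    peaks = {}
    for (referer, _second), hits in per_second.items():
        peaks[referer] = max(peaks.get(referer, 0), hits)
    return peaks


# Hypothetical sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [10/Mar/2020:13:55:36 +0800] "GET /api/example HTTP/1.1" '
    '200 512 "https://site-a.example/" "Mozilla/5.0"',
    '203.0.113.8 - - [10/Mar/2020:13:55:36 +0800] "GET /api/example HTTP/1.1" '
    '200 512 "https://site-a.example/" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Mar/2020:13:55:37 +0800] "GET /api/example HTTP/1.1" '
    '200 512 "-" "curl/7.68.0"',
]
for referer, peak in sorted(peak_requests_per_second(sample).items()):
    print(referer, peak)
```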

If your website has a large amount of traffic, please deploy a local data cache on your own backend, or at least use my hourly-updated GitHub Data Warehouse, and have your front end request the cached data from your own backend rather than forwarding all of its traffic to this API. Otherwise, it is difficult for this server to respond to such a huge number of requests.
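
A local cache of this kind can be quite small. The sketch below is one possible shape, not code from this project: `CachedFetcher`, `fetch_from_upstream`, and the 10-minute TTL are all assumptions made for illustration. It refreshes at most once per TTL and keeps serving the last good copy if the upstream request fails. It is not thread-safe, so a multi-threaded backend would need to add a lock.

```python
import time
from typing import Any, Callable


class CachedFetcher:
    """Serve a cached upstream response and refresh it at most once per TTL."""

    def __init__(self, fetch: Callable[[], Any], ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._has_value = False
        self._next_refresh = 0.0

    def get(self) -> Any:
        now = self._clock()
        if not self._has_value or now >= self._next_refresh:
            # Schedule the next attempt first, so a failing upstream is
            # retried at most once per TTL once a value has been cached.
            self._next_refresh = now + self._ttl
            try:
                self._value = self._fetch()
                self._has_value = True
            except Exception:
                if not self._has_value:
                    raise  # nothing cached yet, so surface the error
                # Otherwise keep serving the stale copy.
        return self._value


def fetch_from_upstream():
    # Placeholder: replace with a real HTTP request to the API, or a read of
    # the GitHub Data Warehouse files.
    return {"fetched_at": time.time()}


cache = CachedFetcher(fetch_from_upstream, ttl_seconds=600)
first = cache.get()
second = cache.get()  # served from memory, no second upstream call
print(first is second)
```

With this in place, the front end talks only to your own backend, and the upstream API sees one request per TTL per process no matter how many visitors you have.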

I need to close ports 80 and 443 for about 1 hour to maintain the server. Please wait until the maintenance is complete.



BlankerL commented 4 years ago

All services except the data warehouse updating automator have now been relaunched. Please cache the data on your own backend in case the server goes offline unexpectedly.

In addition, I may block front-end forwarded traffic in the future to protect the interests of the majority of users.

BlankerL commented 4 years ago

The data warehouse updating automator has been relaunched.

BlankerL commented 4 years ago

Recently, DXY has been changing its data structure frequently, which causes the crawler to go offline repeatedly, get stuck in an infinite loop, and crash the server. I will fix it, and the server should be back online in about half an hour.
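
One common guard against this kind of failure is to cap the number of retries, so that a page that no longer parses cannot keep the crawler spinning. The sketch below only illustrates that idea and is not the fix applied to this project's crawler; `crawl_once`, `ParseError`, `fetch_page`, `parse_page`, and the retry limits are all hypothetical names and values.

```python
import time


class ParseError(Exception):
    """Raised when a page no longer matches the expected structure."""


def crawl_once(fetch_page, parse_page, max_attempts=3, base_delay=1.0,
               sleep=time.sleep):
    """Fetch and parse one page, giving up after max_attempts failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return parse_page(fetch_page())
        except ParseError:
            if attempt == max_attempts:
                return None  # give up until the next scheduled run
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return None


def fetch_page():
    return '{"unexpected": "layout"}'  # stand-in for a downloaded page


def parse_page(raw):
    # Simulates a parser that fails because the source structure changed.
    raise ParseError("expected key not found")


result = crawl_once(fetch_page, parse_page, max_attempts=3, base_delay=0.0)
print(result)
```

Returning `None` after the last attempt lets the caller log the failure and keep the previously stored data until the parser is updated.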

BlankerL commented 4 years ago

Relaunched.