ustclug / discussions

Issue Tracker for USTC LUG

Downloading large files through the reverse-proxy mirror always fails #198

Open abcfy2 opened 6 years ago

abcfy2 commented 6 years ago

Test command:

curl -O https://cloudera.proxy.ustclug.org/cm5/ubuntu/xenial/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.13.1-1.cm5131.p0.2~xenial-cm5_all.deb
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  701M    0 1567k    0     0   216k      0  0:55:16  0:00:07  0:55:09  241k
curl: (18) transfer closed with 734423739 bytes remaining to read

When the file being downloaded is fairly large, this error shows up at random from time to time.

gaoyifan commented 6 years ago

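Thanks for the report. The cause is that too many people were using the reverse proxy server at the same time, which exhausted its buffer space.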

2017/12/08 23:31:14 [crit] 23434#23434: *18692359 pwritev() "/tmp/mem/nginx_temp/0000165591" failed (28: No space left on device) while reading upstream, client: 202.38.93.152, server: *.proxy.ustclug.org, request: "GET /cm5/ubuntu/xenial/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.13.1-1.cm5131.p0.2~xenial-cm5_all.deb HTTP/1.1", upstream: "https://151.101.8.167:443/cm5/ubuntu/xenial/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.13.1-1.cm5131.p0.2~xenial-cm5_all.deb", host: "cloudera.proxy.ustclug.org"
2017/12/08 23:31:35 [crit] 23435#23435: *18692749 pwritev() "/tmp/mem/nginx_temp/0000165594" has written only 8192 of 36864 while reading upstream, client: 202.38.93.152, server: *.proxy.ustclug.org, request: "GET /cm5/ubuntu/xenial/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.13.1-1.cm5131.p0.2~xenial-cm5_all.deb HTTP/1.1", upstream: "https://151.101.8.167:443/cm5/ubuntu/xenial/amd64/cm/pool/contrib/e/enterprise/cloudera-manager-daemons_5.13.1-1.cm5131.p0.2~xenial-cm5_all.deb", host: "cloudera.proxy.ustclug.org"

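nginx is not well suited to reverse-proxying large files. For example, if a user requests file A (1 GB in size), nginx occupies 1 GB of storage on the server for the whole duration of the request. If the user's download takes a long time, or there are many such concurrent requests, the server's buffer space is exhausted quickly.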

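This causes a second problem as well: network fluctuation. Because the reverse proxy server pulls from upstream fairly quickly (about 20 MB/s), it saturates the bandwidth of the overseas tunnel, and the throughput of other services (such as light) swings widely.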
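To make the buffering behaviour described above concrete, here is a minimal sketch of one stock nginx setting that bounds it. It is an illustration, not the mirror's actual configuration: the server name and upstream host are placeholders, and it assumes the responses are not being cached (according to the nginx documentation, this limit does not apply to responses that will be cached or stored on disk).

```nginx
# Sketch only: hostnames are placeholders, not the mirror's real configuration.
server {
    listen 80;
    server_name example.proxy.invalid;

    location / {
        proxy_pass https://upstream.example.invalid;

        # The default is 1024m. Setting 0 disables spooling the response to
        # temporary files, so nginx keeps only its in-memory proxy buffers and
        # reads from upstream roughly as fast as the client drains them.
        proxy_max_temp_file_size 0;
    }
}
```

The trade-off is that the upstream connection stays open for as long as the slowest client takes to download, but neither the temp directory nor the tunnel is filled by a fast upstream read.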

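Personal suggestion: for large files, fetch only a small chunk from upstream at a time, and fetch the next chunk only after the user has retrieved the current one. This makes more efficient use of the buffer space. https://github.com/Qihoo360/ngx_http_subrange_module (there are many other modules with similar functionality)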
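One of those similar modules ships with nginx itself: ngx_http_slice_module (available since nginx 1.9.8, enabled at build time with --with-http_slice_module). The sketch below uses that stock module, not the ngx_http_subrange_module linked above. It splits each request into fixed-size range requests and caches every slice separately. The cache path, zone name, sizes, and hostnames are placeholders chosen for illustration.

```nginx
# Sketch only, to be placed inside the http block.
# Uses stock ngx_http_slice_module, not ngx_http_subrange_module.
# Paths, zone name, sizes and hostnames are placeholders.
proxy_cache_path /var/cache/nginx/slices keys_zone=slices:10m max_size=10g inactive=1d;

server {
    listen 80;
    server_name example.proxy.invalid;

    location / {
        slice 1m;                                        # request upstream in 1 MB ranges
        proxy_cache slices;
        proxy_cache_key $uri$is_args$args$slice_range;   # one cache entry per slice
        proxy_set_header Range $slice_range;             # ask upstream for this slice only
        proxy_cache_valid 200 206 1h;                    # partial (206) responses must be cacheable
        proxy_pass https://upstream.example.invalid;
    }
}
```

This only works if the upstream honours Range requests. The slice size is a trade-off: smaller slices tie up less space per in-flight request but produce more upstream requests and more cache files.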

abcfy2 commented 6 years ago

Or take a look at the official nginx documentation: https://www.nginx.com/blog/nginx-caching-guide/

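Following the tuning advice in that document, I enabled proxy_cache_lock on;. With it, concurrent requests seem to produce only a single cache file: other clients wait and are then served from the cache, instead of every request being passed to upstream.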
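A minimal sketch of what such a setup might look like is shown below. Only proxy_cache_lock on; comes from the comment above. The cache path, zone name, sizes, hostnames, and the two timeout values are assumptions added for illustration.

```nginx
# Sketch only, to be placed inside the http block.
# Paths, zone name, sizes, hostnames and timeouts are placeholders.
proxy_cache_path /var/cache/nginx/mirror keys_zone=mirror:10m max_size=10g inactive=1d;

server {
    listen 80;
    server_name example.proxy.invalid;

    location / {
        proxy_cache mirror;
        proxy_cache_valid 200 1h;

        # Only one request per cache key is passed upstream to populate the
        # cache; other requests for the same key wait for it.
        proxy_cache_lock on;

        # Both default to 5s. Once the timeout expires, a waiting request is
        # passed upstream anyway and its response is not cached; once the age
        # expires, one more request may be passed upstream to fill the cache.
        proxy_cache_lock_timeout 10m;
        proxy_cache_lock_age 10m;

        proxy_pass https://upstream.example.invalid;
    }
}
```

With the 5-second defaults, a file of several hundred megabytes will usually not finish caching before the lock times out, so waiting clients would still fall through to upstream. Raising the timeouts avoids that, at the cost of making those clients wait until the first download completes.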

leozhangzhang commented 5 years ago

When I was downloading Cloudera files today, it kept prompting me to retry. [image]