Billyzou0741326 / bilibili-live-monitor-ts

bilibili - Bilibili live stream monitoring
https://billyzou0741326.github.io/bilibili-live-monitor-ts/
MIT License

Area traversal problem #46

Open luoshuijs opened 4 years ago

luoshuijs commented 4 years ago

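A friend of mine just came to me with this feedback:

I've noticed it misses a lot
I've noticed it misses a whole bunch, especially on the vtb side
It feels like the V area was split out on its own and currently isn't being scanned??
I've noticed that none of the captain (舰长) rewards in V rooms here are being claimed
Really, last night I manually claimed a lot of the V ones, and the backend didn't report any errors
which proves they were never scanned
It's not just once or twice, it has happened many times. One V live room had ten captains, and I waited almost two minutes
and when I clicked, all 10 were still there, and the backend hadn't reported any failure to claim them either

I've also felt that something has been off lately (but I can't read nodejs)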

blu3mania commented 4 years ago

@Billyzou0741326, when you reverted partial changes for "protover=2", you didn't remove it from _handshake.

Billyzou0741326 commented 4 years ago

Will fix with a couple other things in the next commit

Billyzou0741326 commented 4 years ago

Thanks for the reminder

Billyzou0741326 commented 4 years ago

@blu3mania Never mind, that was the expected behavior. We'll use v2 (adjusted for zlib compression), although the re commit was a bit misleading. Though it uses protover 2, the handshake and heartbeat are still expected to be sent as v1, at least for now.
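For readers unfamiliar with the packet framing being discussed, here is a minimal sketch of how a "protover 2" body could be unwrapped while handshake and heartbeat packets are still built as v1. The 16-byte big-endian header layout (total length, header length, protocol version, operation, sequence) and the operation numbers used in any caller are assumptions taken from community descriptions of the protocol, not from this thread, and `encodePacket` / `decodePackets` are hypothetical names rather than the project's actual functions.

```typescript
import * as zlib from "zlib";

// Assumed header layout (not stated in this thread), all big-endian:
// [0..4) total length, [4..6) header length, [6..8) protocol version,
// [8..12) operation, [12..16) sequence.
const HEADER_LEN = 16;

interface Packet {
  ver: number;
  op: number;
  body: Buffer;
}

// Build one packet. Per the comment above, handshake and heartbeat
// would be built with ver = 1 even when ver-2 replies are expected.
function encodePacket(ver: number, op: number, body: Buffer): Buffer {
  const header = Buffer.alloc(HEADER_LEN);
  header.writeUInt32BE(HEADER_LEN + body.length, 0);
  header.writeUInt16BE(HEADER_LEN, 4);
  header.writeUInt16BE(ver, 6);
  header.writeUInt32BE(op, 8);
  header.writeUInt32BE(1, 12);
  return Buffer.concat([header, body]);
}

// Split a buffer into packets. A ver-2 body is assumed to be zlib data
// wrapping one or more inner packets, so it is inflated and parsed again.
function decodePackets(data: Buffer): Packet[] {
  const out: Packet[] = [];
  let offset = 0;
  while (offset + HEADER_LEN <= data.length) {
    const total = data.readUInt32BE(offset);
    const headerLen = data.readUInt16BE(offset + 4);
    // Stop on a malformed or truncated packet instead of looping forever.
    if (headerLen < HEADER_LEN || total < headerLen || offset + total > data.length) {
      break;
    }
    const ver = data.readUInt16BE(offset + 6);
    const op = data.readUInt32BE(offset + 8);
    const body = data.subarray(offset + headerLen, offset + total);
    if (ver === 2) {
      out.push(...decodePackets(zlib.inflateSync(body)));
    } else {
      out.push({ ver, op, body });
    }
    offset += total;
  }
  return out;
}
```

A receive handler would pass each incoming message buffer to `decodePackets` and dispatch on `op`; the sketch does not reassemble packets split across TCP reads.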

@luoshuijs I tested a site-wide crawl; vtb rooms shouldn't be getting missed.

blu3mania commented 4 years ago

@Billyzou0741326 I know you reverted it back in prepareData, but I was talking about this line: https://github.com/Billyzou0741326/bilibili-live-monitor-ts/commit/db43ad0f9cf4fde641510866e5aafda994ace497#diff-9d5629e6ed0ad811435e90ce37a2ad3bR47. If I comment it out, I get more result. Though, I haven't figured out why.

Actually never mind, I was using an old version for comparison. Lemme re-test...

Yeah, tested and confirmed it works fine. Sorry for the false alarm.

Billyzou0741326 commented 4 years ago

> Yeah, tested and confirmed it works fine. Sorry for the false alarm.

No worries, better false alarm than missed bugs.

Billyzou0741326 commented 4 years ago

@luoshuijs Some rooms may indeed be missing. I'll look into adjusting the fetching strategy.

luoshuijs commented 4 years ago

@Billyzou0741326 I just looked at the backend logs and was stunned: starting at 2020-04-11 20:20:50, a large number of errors appeared

 [2020-04-11 22:22:53]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:53]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:54]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:54]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:54]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:55]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:55]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:55]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:55]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:56]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:56]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:56]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:56]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:56]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-11 22:22:57]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-11 22:22:57]   Bilibili.getRoomsInArea - Http request timed out

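I only pulled the latest version today; before that I was using the version from https://github.com/Billyzou0741326/bilibili-live-monitor-ts/pull/43 . It had been running from the 7th until yesterday, when I noticed the problem, and was never stopped in between. I'm now asking my friend whether he hit the same problem, and have asked him to pull his logs.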

luoshuijs commented 4 years ago

Got the logs. On my friend's side, a large number of errors started appearing at 2020-04-10 14:38:01

 [2020-04-10 14:38:01]   Bilibili.getRoomsInArea - Http request errored - getaddrinfo ENOTFOUND api.live.bilibili.com
 [2020-04-10 14:38:02]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:03]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:03]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:03]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:03]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out
 [2020-04-10 14:38:05]   Bilibili.getRoomsInArea - Http request timed out

He had been running it from about the 5th until yesterday, and then updated the code. Before that he was using this version: https://github.com/Billyzou0741326/bilibili-live-monitor-ts/pull/42

Billyzou0741326 commented 4 years ago

@blu3mania Any ideas why the Connection: keep-alive header would cause request timeout and errors? I'm not sure if I misused the http client or it's node.js's problem, and I couldn't reproduce this on my end.

blu3mania commented 4 years ago

@Billyzou0741326 I don't think it has anything to do with "keep-alive". The HttpError for a request timeout is raised when the nodejs HTTP agent detects an idle socket, meaning the server didn't respond in time, or the request/response was delayed along the path or got jammed up at the sender's or receiver's end. This is not the same as the HTTP 408 status when using keep-alive, which I assume nodejs handles already. I have seen request timeouts in my local env, usually during busy times when there were lots of connections from my end (and presumably on Bili's end as well). Personally, I made the request timeout configurable in settings and am currently using 10 seconds instead of the default 4. I still see it occasionally, but not often. During busy times the error is usually ECONNRESET.

Also, since luoshuijs's friend got the error on Apr 10, it happened before you changed Connection header so I assume it had more to do with network. Especially when you get ENOTFOUND it is a DNS related issue which has nothing to do with HTTP request itself. I used to get that from time to time, but I am now using the dynamic IP assignment based on Pearlulu's algorithm so I don't see this problem at all. As a side note, I also saw it from the raffle JS client occasionally in the past, and I added https://github.com/devswede/dns-cache which addressed that.
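As an illustration of why caching DNS answers can hide intermittent `ENOTFOUND` errors, here is a minimal sketch of an in-memory resolver cache. It is not the API of the dns-cache package linked above, and `DnsCache`, `Resolver`, and the serve-stale-on-failure behaviour are assumptions made for this example.

```typescript
// Any function that resolves a hostname to addresses, for example
// (host) => dns.promises.resolve4(host) from Node's "dns" module.
type Resolver = (hostname: string) => Promise<string[]>;

interface CacheEntry {
  addresses: string[];
  expiresAt: number;
}

class DnsCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly resolver: Resolver;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(resolver: Resolver, ttlMs: number, now: () => number = () => Date.now()) {
    this.resolver = resolver;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async lookup(hostname: string): Promise<string[]> {
    const cached = this.entries.get(hostname);
    // Fresh entry: answer from memory without touching the resolver.
    if (cached && cached.expiresAt > this.now()) {
      return cached.addresses;
    }
    try {
      const addresses = await this.resolver(hostname);
      this.entries.set(hostname, { addresses, expiresAt: this.now() + this.ttlMs });
      return addresses;
    } catch (err) {
      // Design choice of this sketch: if resolution fails but an older
      // answer exists, serve the stale answer instead of failing.
      if (cached) {
        return cached.addresses;
      }
      throw err;
    }
  }
}
```

The resolver and clock are injected so the cache can be exercised without real network access. A caller would resolve the API hostname through `lookup` and connect to one of the returned addresses.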

Pearlulu commented 4 years ago

I looked at the logs and found I have this problem too. Only a few connections stay alive. Has Bilibili imposed some kind of limit?

Billyzou0741326 commented 4 years ago

https://t.bilibili.com/379202550902413701?tab=2

Pearlulu commented 4 years ago

Then we're done for. I tried it: sometimes it's 8 connections, sometimes 18. I'll see if there's another way.

Billyzou0741326 commented 4 years ago

commit 4aac231999e8a13f3efa07b07ef123ff3f581597 scales monitoring down to the area level. Let's leave it at that; I don't plan to try working around an IP-level limit.

Pearlulu commented 4 years ago

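The main thing is that this kind of limit isn't very reasonable. For example, I'm on China Telecom and currently have a private (NAT) IP, which means N people share one public IP. Maybe only broadcastlv.chat.bilibili.com is limited.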

Billyzou0741326 commented 4 years ago

tx-live-dmcmt-sv-01.chat.bilibili.com -> html
tx-tokyo-live-comet-01.chat.bilibili.com -> html
broadcastlv.chat.bilibili.com -> tcp

Still, only broadcastlv.chat.bilibili.com is usable, and the actual limiting rules are most likely more complex.

Pearlulu commented 4 years ago

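I tried wss connections, and they are also limited to exactly 72 rooms per IP. It looks like a targeted limit, so this is probably dead. I only have a dozen or so IPs, which is nowhere near enough to save it.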

Pearlulu commented 4 years ago

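A pity for such a good project. It was good practice, so not much of a loss really. I'll take a break and tinker with something else.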

Billyzou0741326 commented 4 years ago

Send any fun projects to my email. All are welcome.

Pearlulu commented 4 years ago

How about something brain-burning, like deep learning? I took one look and gave up...

Pearlulu commented 4 years ago

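Recently, to make it easier to listen to vtubers' ASMR, I built a little restreaming tool that is both simple and fiddly. It's only a dozen or so KB, but it took me a long time.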

Billyzou0741326 commented 4 years ago

Send one to my email.

Pearlulu commented 4 years ago

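> Send one to my email.

There's not much code, but it's a bit fiddly. I'll tidy it up and send it to you to play with. The UI is thrown together; I added a few simple features and then couldn't be bothered to keep going.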

Pearlulu commented 4 years ago

It looks like the limit has been lifted now?

Billyzou0741326 commented 4 years ago

If it's lifted, then use this branch: https://github.com/Billyzou0741326/bilibili-live-monitor-ts/tree/before-4-17

Pearlulu commented 4 years ago

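It's strange. What about companies where everyone shares one IP, or the various regional mobile data plans that all go through proxy servers, with many people sharing one IP? How are those cases handled under this limit?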

Pearlulu commented 4 years ago

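Sure enough, rate limiting is all it takes. I haven't found the requests-per-minute threshold yet, but it works, and that's good enough.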

Billyzou0741326 commented 4 years ago

The cap is 72 connections.

blu3mania commented 4 years ago

@Billyzou0741326 I did lots of testing and found that connection rate is not the issue, but that their 4 IPs are limited to 18 connections each. So, if using Pearlulu's dynamic IP assignment, just make sure it resolves DNS first and then disable disconnection tracking. In this way 6 area monitor + 66 dynamic rooms can be setup. Though, it's still almost useless...
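The arithmetic above (4 server IPs at 18 connections each gives 72, split as 6 area monitors plus 66 dynamic rooms) can be sketched as a small per-IP slot planner. `IpConnectionPlanner` is a hypothetical helper written for illustration, not part of the project or of Pearlulu's algorithm, and the per-IP limit is simply passed in as the figure reported in this thread.

```typescript
// Tracks how many connections are assigned to each resolved server IP
// and refuses to go above a fixed per-IP cap.
class IpConnectionPlanner {
  private readonly ips: string[];
  private readonly counts: number[];
  private readonly perIpLimit: number;

  constructor(ips: string[], perIpLimit: number) {
    this.ips = ips.slice();
    this.counts = ips.map(() => 0);
    this.perIpLimit = perIpLimit;
  }

  // Total number of connections that fit under the cap.
  get capacity(): number {
    return this.ips.length * this.perIpLimit;
  }

  // Pick the least-loaded IP that still has room, or null if all are full.
  acquire(): string | null {
    let best = -1;
    for (let i = 0; i < this.ips.length; i++) {
      const hasRoom = this.counts[i] < this.perIpLimit;
      if (hasRoom && (best === -1 || this.counts[i] < this.counts[best])) {
        best = i;
      }
    }
    if (best === -1) {
      return null;
    }
    this.counts[best] += 1;
    return this.ips[best];
  }

  // Free one slot on the given IP when its connection closes.
  release(ip: string): void {
    const index = this.ips.indexOf(ip);
    if (index !== -1 && this.counts[index] > 0) {
      this.counts[index] -= 1;
    }
  }
}
```

With four addresses and a limit of 18, `capacity` is 72, so reserving 6 slots for area monitors leaves 66 for dynamic rooms. The addresses themselves would come from resolving the chat host once up front, as the comment suggests.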

Pearlulu commented 4 years ago

Try WebSocket, connecting one room per second. I tested 3 pages, and 297 rooms stayed connected simultaneously without dropping.
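A minimal sketch of the "one room per second" pacing described here, with the actual connection logic left to the caller. `connectPaced` and its `connect` callback are hypothetical names, and whether pacing alone avoids the limit is only what this comment reports, not something the sketch verifies.

```typescript
// Resolve after the given number of milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Open rooms one at a time, waiting intervalMs between attempts.
// Returns the room ids whose connect() call rejected.
async function connectPaced(
  roomIds: number[],
  intervalMs: number,
  connect: (roomId: number) => Promise<void>
): Promise<number[]> {
  const failed: number[] = [];
  let first = true;
  for (const roomId of roomIds) {
    if (!first) {
      await sleep(intervalMs);
    }
    first = false;
    try {
      await connect(roomId);
    } catch (err) {
      // Keep going; the caller decides whether to retry failed rooms.
      failed.push(roomId);
    }
  }
  return failed;
}
```

A caller would pass an interval of 1000 and a `connect` function that opens the WebSocket for one room and resolves once the handshake completes.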

Billyzou0741326 commented 4 years ago

With a sensible design, ws could have been implemented through inheritance. A pity.