Closed francisafu closed 1 month ago
That's interesting, because your configuration looks good to me. Let's start with the obvious suggestion: have you tried turning the stack off and on again? :D
Yeah, of course. I've toggled it on and off and run compose up and down multiple times; it doesn't work, the same problem still occurs.
HOARDER_VERSION=release
NEXTAUTH_SECRET=some_random_keys
MEILI_MASTER_KEY=some_other_random_keys
NEXTAUTH_URL=http://192.168.124.2:6600
MAX_ASSET_SIZE_MB=20480
OPENAI_API_KEY=fk**************
OPENAI_BASE_URL=https://*****.net
INFERENCE_LANG=chinese
After inserting
console.log(e);
in crawlerWorker.ts, I found that the worker downloads the adblocker's easylist rules from GitHub, and for some reason the proxy configured via environment variables is not used. Pinning GitHub's IP address in the hosts file makes it work normally.
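If pinning the IP is what you need, one way to do it inside the container (rather than on the host) is Docker Compose's `extra_hosts` option. This is only a sketch: the service name `workers` is an assumption, and the IPs are illustrative values taken from the ping output below; resolve current ones yourself before using them.

```yaml
services:
  workers:
    # ... existing worker service configuration ...
    extra_hosts:
      # Illustrative IPs; look up current addresses first.
      - "github.com:140.82.112.4"
      - "raw.githubusercontent.com:185.199.111.133"
```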
Well, it doesn't seem like a network problem. I tried the fix you suggested, but the problem still exists.
/app/apps/workers # ping github.com
PING github.com (140.82.112.4): 56 data bytes
64 bytes from 140.82.112.4: seq=0 ttl=47 time=252.627 ms
64 bytes from 140.82.112.4: seq=1 ttl=47 time=252.859 ms
64 bytes from 140.82.112.4: seq=2 ttl=47 time=252.153 ms
64 bytes from 140.82.112.4: seq=3 ttl=47 time=252.746 ms
64 bytes from 140.82.112.4: seq=4 ttl=47 time=252.870 ms
^C
--- github.com ping statistics ---
6 packets transmitted, 5 packets received, 16% packet loss
round-trip min/avg/max = 252.153/252.651/252.870 ms
/app/apps/workers # ping raw.githubusercontent.com
PING raw.githubusercontent.com (185.199.111.133): 56 data bytes
64 bytes from 185.199.111.133: seq=0 ttl=54 time=111.197 ms
64 bytes from 185.199.111.133: seq=1 ttl=54 time=110.841 ms
64 bytes from 185.199.111.133: seq=4 ttl=54 time=112.224 ms
64 bytes from 185.199.111.133: seq=5 ttl=54 time=113.838 ms
64 bytes from 185.199.111.133: seq=6 ttl=54 time=111.442 ms
^C
--- raw.githubusercontent.com ping statistics ---
7 packets transmitted, 5 packets received, 28% packet loss
round-trip min/avg/max = 110.841/111.908/113.838 ms
2024-07-18T09:41:07.591Z info: [Crawler][17] Will crawl "https://www.baidu.com/" for link with id "atrqsg02v8ugw7fwehlygwh6"
2024-07-18T09:41:07.591Z info: [Crawler][17] Attempting to determine the content-type for the url https://www.baidu.com/
2024-07-18T09:41:07.683Z info: [Crawler][17] Content-type for the url https://www.baidu.com/ is "text/html"
2024-07-18T09:41:07.684Z error: [Crawler][17] Crawling job failed: AssertionError [ERR_ASSERTION]: undefined == true
2024-07-18T09:41:08.736Z info: [Crawler][17] Will crawl "https://www.baidu.com/" for link with id "atrqsg02v8ugw7fwehlygwh6"
2024-07-18T09:41:08.736Z info: [Crawler][17] Attempting to determine the content-type for the url https://www.baidu.com/
2024-07-18T09:41:09.860Z info: [Crawler][17] Content-type for the url https://www.baidu.com/ is "text/html"
2024-07-18T09:41:09.861Z error: [Crawler][17] Crawling job failed: AssertionError [ERR_ASSERTION]: undefined == true
2024-07-18T09:41:11.945Z info: [Crawler][17] Will crawl "https://www.baidu.com/" for link with id "atrqsg02v8ugw7fwehlygwh6"
2024-07-18T09:41:11.945Z info: [Crawler][17] Attempting to determine the content-type for the url https://www.baidu.com/
2024-07-18T09:41:12.025Z info: [Crawler][17] Content-type for the url https://www.baidu.com/ is "text/html"
2024-07-18T09:41:12.027Z error: [Crawler][17] Crawling job failed: AssertionError [ERR_ASSERTION]: undefined == true
2024-07-18T09:41:16.058Z info: [Crawler][17] Will crawl "https://www.baidu.com/" for link with id "atrqsg02v8ugw7fwehlygwh6"
2024-07-18T09:41:16.058Z info: [Crawler][17] Attempting to determine the content-type for the url https://www.baidu.com/
2024-07-18T09:41:16.149Z info: [Crawler][17] Content-type for the url https://www.baidu.com/ is "text/html"
2024-07-18T09:41:16.151Z error: [Crawler][17] Crawling job failed: AssertionError [ERR_ASSERTION]: undefined == true
2024-07-18T09:41:24.181Z info: [Crawler][17] Will crawl "https://www.baidu.com/" for link with id "atrqsg02v8ugw7fwehlygwh6"
2024-07-18T09:41:24.181Z info: [Crawler][17] Attempting to determine the content-type for the url https://www.baidu.com/
2024-07-18T09:41:24.271Z info: [Crawler][17] Content-type for the url https://www.baidu.com/ is "text/html"
2024-07-18T09:41:24.272Z error: [Crawler][17] Crawling job failed: AssertionError [ERR_ASSERTION]: undefined == true
Similar to #331; issue closed.
The workers continue to output error information, and the crawler doesn't work.
1. Workers' log:
2. Chrome's log:
3. Docker compose file (identical to the default, except for the port redirection):
4. Environment: