Open michaelhku2002 opened 4 years ago
Same issue! @Terrtia @mokaddem please help!
Hi @annetteshajan!
Can you please check the output of all the Splash crawlers with screen -r Crawler_AIL
and whether all the Splash Docker containers are launched, using sudo docker ps?
This is what screen -r Crawler_AIL says: There is no screen to be resumed matching Crawler_AIL. And sudo docker ps shows no Splash dockers launched. @Terrtia
You need to start all Splash dockers and the crawler:
sudo ./bin/torcrawler/launch_splash_crawler.sh -f <config absolute_path> -p <port_start> -n <number_of_splash>
With <port_start> and <number_of_splash> matching those specified at splash_onion_port in the configuration file of point 3 (/configs/core.cfg).
For example:
sudo ./bin/torcrawler/launch_splash_crawler.sh -f /home/myuser/ail-framework/configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 3
All Splash dockers are launched inside the Docker_Splash screen. You can use sudo screen -r Docker_Splash to connect to the screen session and check the status of all Splash servers.
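To make the relation between -p and -n concrete, here is a minimal sketch of the ports the example above would cover. It assumes the launch script starts one Splash docker per consecutive port beginning at <port_start>; splash_ports is a hypothetical helper written for this illustration, not part of AIL.

```shell
# Hypothetical helper: print the host ports covered by
# "-p <port_start> -n <number_of_splash>", assuming one Splash docker
# per consecutive port (an assumption, not confirmed in this thread).
splash_ports() {
    port=$1
    remaining=$2
    while [ "$remaining" -gt 0 ]; do
        echo "$port"
        port=$((port + 1))
        remaining=$((remaining - 1))
    done
}

# For "-p 8050 -n 3" this prints 8050, 8051 and 8052, one per line.
splash_ports 8050 3
```

Under that assumption, these are the ports that splash_onion_port in the configuration file would need to cover.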
./bin/LAUNCH.sh -c
(AILENV) shajan@annette-inspiron-5567:~/AIL-framework$ sudo ./bin/torcrawler/launch_splash_crawler.sh -f configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 1
There are several suitable screens on:
19016.Docker_Splash (Tuesday 02 June 2020 02:26:07 IST) (Detached)
12265.Docker_Splash (Tuesday 02 June 2020 12:34:08 IST) (Detached)
9890.Docker_Splash (Tuesday 02 June 2020 11:58:23 IST) (Detached)
9766.Docker_Splash (Tuesday 02 June 2020 11:57:30 IST) (Detached)
6489.Docker_Splash (Tuesday 02 June 2020 11:25:03 IST) (Detached)
Use -S to specify a session.
Splash server launched on port 8050
(AILENV) shajan@annette-inspiron-5567:~/AIL-framework$ screen -r Docker_Splash
There is no screen to be resumed matching Docker_Splash.
This is the output I get (see above) when I try to launch the Splash servers. @Terrtia
There might be an issue with the Splash container.
Can you check the output of one of those screens? screen -r 19016
Same response: There is no screen to be resumed matching 19016.
Is there something I haven't installed? @Terrtia
This screen is launched as root. Can you please try with sudo screen -r 19016?
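Screen sessions belong to the user who started them, which is why the sessions created through sudo only show up with sudo screen. The following is a sketch, not something from the AIL scripts: docker_splash_sessions is a hypothetical helper that filters the output of screen -ls down to the Docker_Splash session IDs, so leftover sessions from earlier launches are easy to spot.

```shell
# Hypothetical helper: read `screen -ls` output on stdin and print the
# IDs (pid.name) of the sessions named Docker_Splash.
docker_splash_sessions() {
    awk '$1 ~ /^[0-9]+\.Docker_Splash$/ { print $1 }'
}

# Usage (not run here), as root because the launch script was started with sudo:
#   sudo screen -ls | docker_splash_sessions
#   sudo screen -r 19016.Docker_Splash          # attach to one specific session
#   sudo screen -S 19016.Docker_Splash -X quit  # close a session that is no longer needed
```

Using the full pid.name form avoids the "There are several suitable screens" message when more than one session has the same name.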
It's just this screen which launches. I tried sudo docker ps: it shows no Splash dockers launched, and sudo screen -r Docker_Splash gives the same as the screenshot above.
We can try to manually launch one of the dockers. Can you please give me the output of this command:
sudo docker run -d -p 8050:8050 --restart=always --cpus=1 --memory=2G -v /home/<myuser>/ail-framework/configs/docker/splash_onion/etc/splash/proxy-profiles/:/etc/splash/proxy-profiles/ --net="bridge" scrapinghub/splash --maxrss 1000
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
bbad1635d46d4c98a6c66224638f83a3e3efeecb7dbc1bdc274102e0034a071d
@Terrtia if I use --memory-swap I get this:
docker: Error response from daemon: driver failed programming external connectivity on endpoint adoring_davinci (367ac2ced9ecbe16f170625013426df15a607c08033fd899d893d5b6ab5ed327): Bind for 0.0.0.0:8050 failed: port is already allocated.
All splash dockers down here
Oh, I got this:
sudo docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED          STATUS          PORTS                    NAMES
bbad1635d46d   scrapinghub/splash   "python3 /app/bin/sp…"   21 minutes ago   Up 14 minutes   0.0.0.0:8050->8050/tcp   charming_aryabhata
Also wanted to know does pastebin collect onion addresses or do we have to provide?
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
You can ignore this warning.
if I use --memory-swap i get this: docker: Error response from daemon: driver failed programming external connectivity on endpoint adoring_davinci (367ac2ced9ecbe16f170625013426df15a607c08033fd899d893d5b6ab5ed327): Bind for 0.0.0.0:8050 failed: port is already allocated.
You get this error because the first Splash docker is already using this port (8050).
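If it is unclear which container holds the port, a small filter over docker ps can show it. This is only a sketch: containers_on_port is a hypothetical helper, and it expects the "<container id> <ports>" lines produced by the --format template shown in the usage comment.

```shell
# Hypothetical helper: read "<container id> <ports>" lines on stdin and
# print the IDs of the containers that publish the given host port.
containers_on_port() {
    awk -v port="$1" 'index($0, ":" port "->") { print $1 }'
}

# Usage (not run here):
#   sudo docker ps --format '{{.ID}} {{.Ports}}' | containers_on_port 8050
#   sudo docker rm -f <container id>   # removes that container and frees the port
```

Removing the container is only needed if you want to relaunch Splash on the same port; if the running container is the one you want, the error can simply be left alone.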
Also wanted to know does pastebin collect onion addresses or do we have to provide?
AIL extracts and crawls (if the crawler is enabled) all the onion addresses found inside any item. You can find some onions on pastebin, but you might want to use other sources:
onion keyword.
I'm currently adding a new way to launch and kill all the Docker containers. I'll push it today or tomorrow.
@Terrtia I uploaded a file with onion addresses and all of them seem to be down. How do I resolve this?
Here is the error I'm getting when I run crawler.py. Is there any fix for this? @Terrtia @mokaddem
Dear all,
I have installed the framework and followed https://github.com/CIRCL/AIL-framework/blob/master/HOWTO.md to enable the crawler. However, even though I followed the steps (note: I have set Splash host = AIL host) and installed Splash separately, the crawling page still looks like this
I have also tried to submit a domain (e.g. pastebin) to "Manual Crawler", but still no result is shown. What is preventing the crawling functionality from starting? Thanks!
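One way to narrow this down is to check whether the Splash instance answers at all before looking at the crawler itself. The sketch below assumes Splash listens on the AIL host at port 8050 (the port used elsewhere in this thread) and relies on Splash's _ping HTTP endpoint; splash_ping_url and splash_is_up are hypothetical helper names.

```shell
# Hypothetical helper: build the URL of the Splash _ping endpoint.
splash_ping_url() {
    printf 'http://%s:%s/_ping\n' "$1" "$2"
}

# Hypothetical helper: succeed if a Splash instance answers on <host> <port>.
splash_is_up() {
    curl -fsS --max-time 5 "$(splash_ping_url "$1" "$2")" >/dev/null 2>&1
}

# Usage (not run here); host and port are examples, adjust them to your setup:
#   splash_is_up 127.0.0.1 8050 && echo "Splash reachable" || echo "Splash not reachable"
```

If Splash is not reachable, the problem is on the docker side rather than in the crawler configuration.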