CIRCL / AIL-framework

AIL framework - Analysis Information Leak framework. Project moved to https://github.com/ail-project
https://github.com/ail-project/ail-framework
GNU Affero General Public License v3.0

Cannot see any Crawled screen in Web Interface #413

Open michaelhku2002 opened 4 years ago

michaelhku2002 commented 4 years ago

Dear all,

I have installed the framework according to https://github.com/CIRCL/AIL-framework/blob/master/HOWTO.md to enable the crawler. However, even though I followed the steps (note: I have set Splash host = AIL host) and installed Splash separately, the crawler page still looks like the attached image.

I have also tried submitting a domain (e.g. pastebin) to "Manual Crawler", but no results are shown. What is preventing the crawling functionality from starting? Thanks!

annetteshajan commented 4 years ago

Same issue! @Terrtia @mokaddem please help!

Terrtia commented 4 years ago

Hi @annetteshajan !

Can you please check the output of all the Splash crawlers with screen -r Crawler_AIL, and check whether all the Splash Docker containers are running with sudo docker ps?
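
The docker ps check can be scripted. This is a minimal sketch, assuming the Splash containers run the scrapinghub/splash image (the image name used in the docker run command further down this thread). count_splash_images is a hypothetical helper, not part of AIL.

```shell
# Count running containers whose image is scrapinghub/splash
# (with or without a tag). Reads one image name per line on stdin.
# count_splash_images is a hypothetical helper name, not an AIL command.
count_splash_images() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c -E '^scrapinghub/splash(:.*)?$' || true
}
```

Usage: sudo docker ps --format '{{.Image}}' | count_splash_images. A result of 0 means no Splash container is currently running.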

annetteshajan commented 4 years ago

This is what screen -r Crawler_AIL says: There is no screen to be resumed matching Crawler_AIL. And sudo docker ps shows no Splash dockers launched. @Terrtia

Terrtia commented 4 years ago

You need to start all the Splash dockers and the crawler.

For example:
sudo ./bin/torcrawler/launch_splash_crawler.sh -f /home/myuser/ail-framework/configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 3

All Splash dockers are launched inside the Docker_Splash screen. You can use sudo screen -r Docker_Splash to connect to the screen session and check all Splash servers status.

annetteshajan commented 4 years ago

(AILENV) shajan@annette-inspiron-5567:~/AIL-framework$ sudo ./bin/torcrawler/launch_splash_crawler.sh -f configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 1
There are several suitable screens on:
    19016.Docker_Splash (Tuesday 02 June 2020 02:26:07 IST) (Detached)
    12265.Docker_Splash (Tuesday 02 June 2020 12:34:08 IST) (Detached)
    9890.Docker_Splash (Tuesday 02 June 2020 11:58:23 IST) (Detached)
    9766.Docker_Splash (Tuesday 02 June 2020 11:57:30 IST) (Detached)
    6489.Docker_Splash (Tuesday 02 June 2020 11:25:03 IST) (Detached)
Use -S to specify a session.
Splash server launched on port 8050
(AILENV) shajan@annette-inspiron-5567:~/AIL-framework$ screen -r Docker_Splash
There is no screen to be resumed matching Docker_Splash.
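
The output above shows five detached sessions that all share the name Docker_Splash, which is why screen asks for -S. As a rough sketch, the full session IDs for a given name can be pulled out of screen -ls output. list_sessions_named is a hypothetical helper, and the parsing assumes the usual screen -ls layout where the first field of each session line is pid.name.

```shell
# Print the full "pid.name" ID of every screen session with the given name.
# Reads the output of "screen -ls" on stdin.
# list_sessions_named is a hypothetical helper name, not an AIL command.
list_sessions_named() {
    awk -v name="$1" '
        {
            id = $1
            dot = index(id, ".")
            # Keep only fields shaped like <digits>.<name>
            if (dot > 1 && substr(id, dot + 1) == name && substr(id, 1, dot - 1) ~ /^[0-9]+$/)
                print id
        }'
}
```

Usage: sudo screen -ls | list_sessions_named Docker_Splash. Each printed ID can be attached with sudo screen -r <id>, or closed with sudo screen -S <id> -X quit if it is a leftover from an earlier launch. Closing leftovers is a suggestion here, not something the maintainers recommended in this thread.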

annetteshajan commented 4 years ago

This is the output I get when I try to launch the Splash servers. @Terrtia

Terrtia commented 4 years ago

There might be an issue with the Splash container. Can you check the output of one of those screens? screen -r 19016

annetteshajan commented 4 years ago

Same response: There is no screen to be resumed matching 19016.

annetteshajan commented 4 years ago

Is there something I haven't installed? @Terrtia

Terrtia commented 4 years ago

This screen was launched as root. Can you please try with sudo screen -r 19016?

annetteshajan commented 4 years ago

[Screenshot from 2020-06-02 14-51-12]

Only this screen launches. I tried sudo docker ps and it shows no Splash dockers launched. sudo screen -r Docker_Splash gives the same result as the screenshot above.

Terrtia commented 4 years ago

We can try to launch one of the Docker containers manually. Can you please give me the output of this command:

sudo docker run -d -p 8050:8050 --restart=always --cpus=1 --memory=2G -v /home/<myuser>/ail-framework/configs/docker/splash_onion/etc/splash/proxy-profiles/:/etc/splash/proxy-profiles/ --net="bridge" scrapinghub/splash --maxrss 1000

annetteshajan commented 4 years ago

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
bbad1635d46d4c98a6c66224638f83a3e3efeecb7dbc1bdc274102e0034a071d

@Terrtia If I use --memory-swap I get this:

docker: Error response from daemon: driver failed programming external connectivity on endpoint adoring_davinci (367ac2ced9ecbe16f170625013426df15a607c08033fd899d893d5b6ab5ed327): Bind for 0.0.0.0:8050 failed: port is already allocated.

annetteshajan commented 4 years ago

[Screenshot from 2020-06-02 16-56-02]

All Splash dockers are shown as down here. Oh, I got this from sudo docker ps:

CONTAINER ID   IMAGE                COMMAND                  CREATED          STATUS          PORTS                    NAMES
bbad1635d46d   scrapinghub/splash   "python3 /app/bin/sp…"   21 minutes ago   Up 14 minutes   0.0.0.0:8050->8050/tcp   charming_aryabhata

annetteshajan commented 4 years ago

Also wanted to know does pastebin collect onion addresses or do we have to provide?

Terrtia commented 4 years ago

> WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

You can ignore this warning.

> if I use --memory-swap i get this: docker: Error response from daemon: driver failed programming external connectivity on endpoint adoring_davinci (367ac2ced9ecbe16f170625013426df15a607c08033fd899d893d5b6ab5ed327): Bind for 0.0.0.0:8050 failed: port is already allocated.

You get this error because the first Splash docker, which is still running, is already using this port (8050).
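
To see which container is holding a host port before starting another one, something like the following sketch may help. containers_on_port is a hypothetical helper. It assumes input lines of the form name|ports, where the ports column uses the host:port->container_port notation that docker ps prints.

```shell
# Print the names of containers that publish the given host port.
# Reads "name|ports" lines on stdin.
# containers_on_port is a hypothetical helper name, not an AIL command.
containers_on_port() {
    awk -F '|' -v port="$1" '
        # ":<port>->" matches only the host side of a mapping such as
        # 0.0.0.0:8050->8050/tcp, not the container side.
        index($2, ":" port "->") > 0 { print $1 }'
}
```

Usage: sudo docker ps --format '{{.Names}}|{{.Ports}}' | containers_on_port 8050. Docker's built-in filter, sudo docker ps --filter publish=8050, is a shorter way to get a similar listing. The container found this way can be stopped with sudo docker stop <name>, or the new container can be published on a different host port.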

> Also wanted to know does pastebin collect onion addresses or do we have to provide?

AIL extracts and crawls (if the crawler is enabled) all the onion addresses found inside any item. You can find some onions on pastebin, but you might want to use other sources:

I'm currently adding a new way to launch and kill all the Docker containers. I'll push it today or tomorrow.

annetteshajan commented 4 years ago

@Terrtia I uploaded a file with onion addresses and all of them seem to be down. How do I resolve this?

annetteshajan commented 4 years ago

[Screenshot from 2020-06-06 14-36-59]

Here is the error I'm getting when I run crawler.py. Is there any fix for this? @Terrtia @mokaddem