AlphaReign / scraper

AlphaReigns DHT Scraper, includes peer updater and categorizer
MIT License

High CPU #28

ghost closed 6 years ago

ghost commented 6 years ago

Hello, me again :D Just wondering if you have ever had this issue, or maybe I have something configured wrong. I have an Elasticsearch database with 12,365,592 torrents, and the scraper (the old one) running on a separate machine. For some reason, when I run the scraper, the machine with Elasticsearch hits the load limits of the 8-core server, with the load exceeding 20.

Without the scraper running, my load is around 3-4.

Kind regards

Raxvis commented 6 years ago

So this is an issue I ran into when I was running the site. The problem is the high ingestion rate into Elasticsearch. You can try changing the limit at which it sends the torrents to Elasticsearch (1,000 to 10,000) and see if that reduces the pressure on the CPU.
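To illustrate the idea of raising that limit, here is a minimal sketch of threshold-based batching. It is not the scraper's actual code: `createBatcher`, `batchSize`, and `sendBulk` are hypothetical names, and `sendBulk` stands in for whatever performs the bulk request. It only shows that a larger threshold means fewer, larger requests for the same number of torrents; whether that lowers CPU load on a given cluster still has to be measured.

```javascript
// Hypothetical sketch: buffer documents and hand them off in bulk once a
// threshold is reached. None of these names come from the scraper itself.
function createBatcher(batchSize, sendBulk) {
  let buffer = [];
  return {
    add(doc) {
      buffer.push(doc);
      if (buffer.length >= batchSize) {
        this.flush();
      }
    },
    flush() {
      if (buffer.length === 0) return;
      const batch = buffer;
      buffer = []; // start a fresh buffer before handing the batch off
      sendBulk(batch);
    },
  };
}

// Count how many bulk requests a given number of torrents produces.
function countRequests(batchSize, total) {
  let requests = 0;
  const batcher = createBatcher(batchSize, () => { requests += 1; });
  for (let i = 0; i < total; i += 1) {
    batcher.add({ infohash: String(i) });
  }
  batcher.flush(); // send whatever is left over
  return requests;
}

console.log(countRequests(1000, 25000));  // 25
console.log(countRequests(10000, 25000)); // 3
```

With a threshold of 1,000, the 25,000 sample documents go out as 25 requests; with 10,000 they go out as 3 (two full batches plus the remainder).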

ghost commented 6 years ago

Thanks for getting back to me. Let me try that and see. Would this be the batch size setting in config.js?

I have noticed one thing that could have been my fault from the very beginning: I set peer age to "1", so I'm guessing it's constantly updating the seeders info? I also just changed scrape frequency from 1 to 10 and noticed a big drop in CPU.

Raxvis commented 6 years ago

Oh, yeah... You should probably drop that to like every 30 minutes or something
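A rough sketch of why a very small peer age keeps the updater busy, assuming (and this is only an assumption) that the setting is a maximum age in minutes after which a torrent's peer info counts as stale. `selectStale`, `peersUpdatedAt`, and `maxAgeMinutes` are hypothetical names, not taken from the scraper.

```javascript
// Hypothetical sketch: pick the torrents whose peer info is older than the
// allowed age. Field and parameter names are illustrative only.
function selectStale(torrents, maxAgeMinutes, nowMs) {
  const cutoff = nowMs - maxAgeMinutes * 60 * 1000;
  return torrents.filter((t) => t.peersUpdatedAt < cutoff);
}

const now = Date.UTC(2019, 0, 1, 12, 0, 0);
const minutesAgo = (m) => now - m * 60 * 1000;
const torrents = [
  { infohash: 'a', peersUpdatedAt: minutesAgo(2) },
  { infohash: 'b', peersUpdatedAt: minutesAgo(20) },
  { infohash: 'c', peersUpdatedAt: minutesAgo(90) },
];

console.log(selectStale(torrents, 1, now).length);  // 3
console.log(selectStale(torrents, 30, now).length); // 1
```

Under that assumption, a peer age of 1 marks almost every torrent as due for a refresh on each pass, while 30 leaves only the genuinely old entries. With millions of torrents, that difference translates directly into update traffic hitting Elasticsearch.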

ghost commented 6 years ago

When scrape frequency is set to 1, it just never stops updating peers :D I guess it's all those millions of torrents it's trying to update LOL

Raxvis commented 6 years ago

Yup

ghost commented 6 years ago

Is there anything you can recommend I do? You mentioned changing the limit for sending the torrents to Elasticsearch. Where can I find this setting? Thanks for your help.

Kind regards