So if you want to compare the two, make sure they are in the same region. Different regions see more or fewer peers, and fewer peers will slow things down.
On Thu, Jan 31, 2019 at 7:00 AM ash121121 notifications@github.com wrote:
Just after some advice. I don't seem to be collecting as many torrents as I did with the old scraper. It's been 2-3 days and I've only collected under 1 million, whereas with the old scraper it was possible to reach 1 million in 24 hours.
Does scraping speed depend on location, host, etc.? I'm currently testing in Amsterdam on DigitalOcean.
Would there be a better location/host?
Kind regards.
Huge thanks to @ash121121 for helping me get my instance working 100% today after some hiccups. Awesome code @Prefinem !
uptime │ 14m │ 1,880 TORRENTS
Just chiming in to say it is quite a bit slower than I remember from the last time I tested. Traffic graph: https://i.gyazo.com/d8ffbdf16f7117637e46dcfe4d258c64.png
Give it some time to spin up. When I tested the new code against the old one, the new code was slightly slower, but it is maintainable.
A lot slower for me. The code is much better, as you say :) Is there any way you can put some more horsepower into the scraper? :D
Not sure. Maybe. Need to look at why it’s slower.
One thing I don't think any of us have taken into account: with the previous scraper, the torrents we wanted to filter out were just marked as inactive but were still counted in the database, whereas with this scraper the nasty torrents don't even enter the database to be counted. Maybe it only looked so fast because of this.
That is a good point. I didn't think about that. We could always drop all the filters and check the speed.
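To make the difference concrete, here is a minimal sketch of the two strategies, assuming hypothetical `isNasty` and `saveTorrent` helpers (they stand in for the scraper's real filter and storage code and are not its actual API):

```js
// Hypothetical illustration of the two filtering strategies discussed above.
const isNasty = (torrent) => /badword/i.test(torrent.name); // placeholder filter
const saveTorrent = async (record) => { /* insert into MySQL / Elasticsearch here */ };

// Old behaviour: every discovered torrent is stored; filtered ones are only
// flagged as inactive, so they still inflate the total count.
async function storeWithFlag(torrent) {
  await saveTorrent({ ...torrent, active: !isNasty(torrent) });
}

// New behaviour: filtered torrents are dropped before the insert,
// so they never show up in the count at all.
async function storeIfClean(torrent) {
  if (isNasty(torrent)) return;
  await saveTorrent({ ...torrent, active: true });
}
```

The same number of torrents is discovered either way; only the counting differs, which could explain part of the apparent slowdown.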
So I spun up 3 servers at DigitalOcean, 2 in Germany and one in NYC, all pointing to the same MySQL and Elasticsearch databases, and I seem to be getting around 1k torrents per 4 minutes.
I'll test without filters
That is about 360K a day.
Seeing as there are only around 30 million torrents generally active on public trackers, that would take roughly 3 months to max out.
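For reference, the back-of-the-envelope arithmetic behind those figures, written out as a quick script (the 30 million total is the estimate quoted above, not a measured number):

```js
// Rough throughput math for the numbers quoted above.
const per4Min = 1000;                         // observed rate across the 3 droplets
const perDay = per4Min * (24 * 60) / 4;       // 1000 * 360 = 360,000 torrents/day
const totalTorrents = 30000000;               // estimated active torrents on public trackers
const daysToMaxOut = totalTorrents / perDay;  // ≈ 83 days, i.e. just under 3 months

console.log({ perDay, daysToMaxOut: Math.round(daysToMaxOut) });
```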
Tested with no filters and got the same result, if not slower :/
Remember to let it run for a bit. It has to gather a lot of peers before torrents start really coming in.
Will do :) I found that adding these to the config boosted speed by at least 2-fold:
{ address: 'router.utorrent.com', port: 6881 },
{ address: 'router.bitcomet.net', port: 554 },
{ address: 'dht.aelitis.com', port: 6881 },
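For anyone trying the same thing, this is roughly where those entries go. The sketch below assumes the config exposes a list of DHT bootstrap peers; the file layout and the `bootstrap` key name are assumptions, so check the scraper's own config for the actual field:

```js
// Config sketch - "bootstrap" is an assumed key name, not necessarily the scraper's own.
module.exports = {
  bootstrap: [
    { address: 'router.bittorrent.com', port: 6881 }, // widely used default bootstrap node
    { address: 'router.utorrent.com', port: 6881 },
    { address: 'router.bitcomet.net', port: 554 },
    { address: 'dht.aelitis.com', port: 6881 },
  ],
};
```

More bootstrap nodes mainly help the DHT fill its routing table faster after a cold start.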
Yes, definitely a bit faster here, thanks @ash121121. Editing formats/tags really slows it down, I think. I added 6 new formats and a couple of tags and got around 20k over the last hour, with inbound traffic reaching around 3.5 Mbit/s. This is on an Intel(R) Xeon(R) CPU E3-1231 v3 @ 3.40GHz, an 8GB VM with 4 CPU cores and an SSD.
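A guess at why extra formats/tags cost throughput: classifying a torrent usually means testing its name against one pattern per format and per tag, so every added entry is extra work on every single insert. A toy sketch of that pattern (not the scraper's actual code, and the patterns are made up):

```js
// Toy illustration: classification cost grows with the number of format/tag patterns.
const formats = [
  { label: '720p',  re: /\b720p\b/i },
  { label: '1080p', re: /\b1080p\b/i },
  { label: 'x265',  re: /\bx265\b/i },
];
const tags = [
  { label: 'repack', re: /\brepack\b/i },
];

function classify(name) {
  // Every discovered torrent name is tested against every pattern,
  // so adding 6 formats and 2 tags adds 8 regex tests per torrent.
  return {
    formats: formats.filter((f) => f.re.test(name)).map((f) => f.label),
    tags: tags.filter((t) => t.re.test(name)).map((t) => t.label),
  };
}

console.log(classify('Some.Show.S01E01.1080p.x265.REPACK'));
// -> { formats: [ '1080p', 'x265' ], tags: [ 'repack' ] }
```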