AlphaReign / scraper

AlphaReign's DHT scraper; includes a peer updater and categorizer
MIT License

Multiple scrapers #14

Closed: ghost closed this issue 6 years ago

ghost commented 6 years ago

Hi. Is it possible to run multiple instances of the scraper in different regions, all pointing to the same Elasticsearch DB? Will there be issues with duplicate torrents?

kind regards

Raxvis commented 6 years ago

No issues; that works just fine. Elasticsearch handles the duplicates.
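
A minimal sketch of why duplicates need not pile up, assuming each torrent is stored under its infohash as the document ID (an assumption; the thread does not say how the scraper keys its documents). Indexing a document under an ID that already exists replaces the stored document instead of adding a second copy. The `Map` below is only an in-memory stand-in for the index, and `indexTorrent` is a hypothetical helper:

```javascript
// In-memory stand-in for an Elasticsearch index.
// Assumption: documents are keyed by the torrent's infohash.
const index = new Map();

// Hypothetical helper: mimics indexing with an explicit document ID.
// Writing to an existing ID replaces the stored document.
function indexTorrent(torrent) {
  index.set(torrent.infohash, torrent);
}

// Two scraper instances in different regions discover the same torrent.
indexTorrent({ infohash: "abc123", name: "Example Torrent", seeders: 4 });
indexTorrent({ infohash: "abc123", name: "Example Torrent", seeders: 9 });

console.log(index.size); // 1
```

The later write wins, so the stored record simply reflects whichever instance saw the torrent most recently.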

ghost commented 6 years ago

Thanks, mate. One last question, if possible. I'm seeing some bad stuff from the DHT end up on the site, like underage porn. What are your thoughts on a way to stop this? I was thinking of using your categorized.js with keywords to mark such torrents inactive.

Kind regards

Raxvis commented 6 years ago

categorized.js would be the perfect place for it.

ghost commented 6 years ago

Thanks for that info, mate. I've added some keywords. Now I just need to figure out how to delete all the bad stuff that has already entered the DB. Is there a way to run categorize.js over the whole DB, or does it only run as new torrents enter?

kind regards

Raxvis commented 6 years ago

You could probably write an Elasticsearch query that deletes the records matching certain criteria. categorize.js only works on new torrents coming in. It may work on ones that are re-indexed as well.
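
A rough sketch of what such a cleanup could look like, assuming Elasticsearch's `_delete_by_query` API is used. The function below only builds the request body; `buildDeleteByQueryBody`, the `name` field, and the keyword strings are placeholders, not names taken from the scraper:

```javascript
// Hypothetical helper: builds a request body for
//   POST /<your-index>/_delete_by_query
// Assumption: the torrent title lives in a text field called "name".
function buildDeleteByQueryBody(keywords, field = "name") {
  return {
    query: {
      bool: {
        // A document matches if it contains at least one blocked phrase.
        should: keywords.map((keyword) => ({
          match_phrase: { [field]: keyword },
        })),
        minimum_should_match: 1,
      },
    },
  };
}

// Placeholder keywords, for illustration only.
const body = buildDeleteByQueryBody(["blocked phrase one", "blocked phrase two"]);
console.log(JSON.stringify(body, null, 2));
```

Sending the same body to `_search` first shows which records would be removed. Note that `match_phrase` matches analyzed tokens rather than raw substrings, so its results may differ slightly from a plain substring check.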

ghost commented 6 years ago

Thanks, mate, appreciate your input. The shit that comes from the DHT is crazy. If you don't filter stuff as a public site, you're screwed. You could get done for facilitating this bad stuff (I won't name it, but you know). Anyway, https://skytorrents.lol is powered by AR with a lot of mods. Thanks for the input.

ghost commented 6 years ago

Maybe I'm missing something, but I added a keyword to the inactive array in categorized.js and torrents containing that keyword are still in the DB.

ghost commented 6 years ago

Maybe I need to study categorized.js; maybe it's not matching the keywords properly. I suck at JS, but what I'm trying to achieve is what stripos does in PHP.

Raxvis commented 6 years ago

indexOf is likely the function you are looking for.

ghost commented 6 years ago

Thanks, looks like it's working :) I used toLowerCase().indexOf(...). Thanks again, mate.
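
For anyone landing here later, the case-insensitive check described above (roughly the JavaScript counterpart of PHP's stripos) might look like the sketch below. `containsBlockedKeyword` and the keyword list are hypothetical; this is not the actual code in the scraper:

```javascript
// Placeholder keywords, for illustration only.
const blockedKeywords = ["blocked phrase", "another term"];

// Hypothetical helper: true if the torrent name contains any keyword,
// ignoring case. indexOf returns -1 when the substring is absent and
// 0 when it sits at the very start, so compare against -1 rather than
// relying on truthiness.
function containsBlockedKeyword(name, keywords) {
  const lowerName = name.toLowerCase();
  return keywords.some(
    (keyword) => lowerName.indexOf(keyword.toLowerCase()) !== -1
  );
}

console.log(containsBlockedKeyword("Some BLOCKED Phrase here", blockedKeywords)); // true
console.log(containsBlockedKeyword("A harmless title", blockedKeywords)); // false
```

A torrent for which this returns true could then be marked inactive instead of being categorized.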

Raxvis commented 6 years ago

Awesome!
