Granitosaurus / ggmt

esport match ticker based on gosugamers.com data
GNU General Public License v3.0

Cloudflare bot protection is up, possible solutions. #3

Open Granitosaurus opened 6 years ago

Granitosaurus commented 6 years ago

So Cloudflare bot protection popped up today on gosugamers.net, which is being used as the source for match data.

There are a few possible solutions to this:

  1. Drop gosugamers as a source. Honestly, gosugamers has been of pretty questionable quality as a source. Its only benefit over some alternatives is that it supports a lot of games.
  2. Bypass bot protection. Cloudflare bot protection can be solved with JavaScript; see https://github.com/Anorov/cloudflare-scrape#integration for an example. The downside is that the ggmt package will require a Node.js runtime as a dependency :(
  3. Wait and/or contact gosugamers for official support. Since ggmt is not a harmful tool, maybe gosugamers would drop the protection or provide some other access. It might be that the Cloudflare protection is only there temporarily.

Either way, I'll work on a branch with solution #2 and look into #1; however, I have low expectations for #3.

Granitosaurus commented 6 years ago

Seems like #2 requires a 5 second sleep, which would render the app close to unusable. The result could probably be cached somewhere, but that increases the implementation complexity.

Granitosaurus commented 6 years ago

I've reached out to gosugamers via email:

On Thursday, March 29th 2018, 1:53:10 pm tinarg tinarg@pm.me wrote:

Hello, I've made few apps for linux that embeds gosugamers data to terminal and/or other mediums that aren't web browsers. Today cloudflare kinda killed every app.

Maybe you'd be willing to provide some API or something for unofficial gosugamer apps?

Dear Sir,

We implemented the changes due to heavy load of scrapers slowing down the site. If you wish to use our API or any of our data this is a commercial product that you can obtain towards a fee.

Best Regards

William Lövqvist support@gosugamers.net

So it seems they provide some sort of API for a fee. I don't think they care about FLOSS or any open projects that use their public data.

I've played around with solution #2 and it's possible to use it in ggmt by stashing the cookies locally and only taking the 5 second delay hit once in a while.
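For reference, the cookie stashing idea could look roughly like the sketch below. This is not the code from the ggmt branch: `load_or_solve`, `CACHE_TTL`, and the JSON cache layout are hypothetical names picked for illustration, and the 30 minute lifetime is an assumption, since the real clearance cookie may expire sooner or later. The slow challenge solver is passed in as `solve`, so the sketch does not depend on any particular library.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Tuple

# Assumed lifetime in seconds; the real clearance expiry may differ.
CACHE_TTL = 30 * 60

# A solver takes a URL and returns (cookies, user_agent).
Solver = Callable[[str], Tuple[Dict[str, str], str]]


def load_or_solve(
    url: str,
    cache_file: Path,
    solve: Solver,
    ttl: float = CACHE_TTL,
    now: Callable[[], float] = time.time,
) -> Tuple[Dict[str, str], str]:
    """Return (cookies, user_agent), reusing the stashed pair while it is fresh."""
    try:
        cached = json.loads(cache_file.read_text())
        if now() - cached["saved_at"] < ttl:
            return cached["cookies"], cached["user_agent"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing or corrupt cache: fall through and solve again

    # Slow path: this is where the ~5 second delay is paid.
    cookies, user_agent = solve(url)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(
        json.dumps({"saved_at": now(), "cookies": cookies, "user_agent": user_agent})
    )
    return cookies, user_agent
```

According to the cloudflare-scrape README, `cfscrape.get_tokens(url)` returns a `(tokens, user_agent)` pair, so it should be usable as `solve` here. The user agent is stored next to the cookies because the README notes that the cookies only work when sent with the same User-Agent that solved the challenge.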