GrafeasGroup / blossom

The website. The app. The everything.

Handle Imgur albums #92

Open TheLonelyGhost opened 7 years ago

TheLonelyGhost commented 7 years ago

Scenario

A post on /r/TranscribersOfReddit appears containing a link to an Imgur album.

Current Result

The OCR bot skips the album (it only transcribes posts that link directly to an image).

Expected Result

The OCR bot pulls each image from the album and transcribes it.

Workaround

None

Proposed fix

Have the OCR bot detect whether the link is of type text/html on the Imgur domain and, if so, scrape the page for each image, deferring to the existing transcription functionality from there.
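As a rough illustration of the detection step, here is a minimal sketch in Python. The function names, the host list, and the assumption that album links look like `imgur.com/a/<id>` are mine, not taken from the bot's code, and the content type would come from whatever HTTP response the bot already fetches.

```python
from urllib.parse import urlparse

# Assumption: these are the hosts that serve album pages (i.imgur.com serves raw images).
IMGUR_PAGE_HOSTS = {"imgur.com", "www.imgur.com", "m.imgur.com"}


def is_imgur_html_page(url, content_type):
    """Return True if the URL is on an Imgur page host and the response is text/html."""
    host = (urlparse(url).hostname or "").lower()
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return host in IMGUR_PAGE_HOSTS and media_type == "text/html"


def album_id_from_url(url):
    """Return the album ID from an imgur.com/a/<id> link, or None if it is not one."""
    parsed = urlparse(url)
    if (parsed.hostname or "").lower() not in IMGUR_PAGE_HOSTS:
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "a":
        return parts[1]
    return None
```

Direct image links would return `None` from `album_id_from_url` and keep going through the existing path.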

perryprog commented 6 years ago

Check out the Imgur API, it’s what I was considering in the ocr.space PR.

TheLonelyGhost commented 6 years ago

Oh neat! Looks like the API doesn't require an API key if it's just accessing public data. Will definitely have to play with it to confirm.

perryprog commented 6 years ago

Wait, you don't? I'm pretty sure it says you need authorization to do a GET on an album. I'll do some testing I suppose.

perryprog commented 6 years ago

Alright, I'll take this issue I suppose. Seems basic enough. I can start working on it for now, but it'll be helpful to get GrafeasGroup/tor_ocr#10 closed.

TheLonelyGhost commented 6 years ago

Oops, I guess I misinterpreted that part of the docs.

NOTE: If your app is not only requesting public read-only information, then you may skip this step.

Looks like it actually meant one could skip the refresh token flow if working only with publicly accessible data, not that they could forgo API keys altogether.

codingJWilliams commented 6 years ago

Yeah you need an API key. I'll have a look and see if I can figure something out tonight. Imgur API keys are easy to get iirc.
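For reference, a minimal sketch of what the album lookup could look like with only a client ID. It assumes the v3 album images endpoint and the `Authorization: Client-ID ...` header; the function names and the placeholder client ID are hypothetical, and error handling is left out.

```python
import json
import urllib.request

# Imgur API v3 endpoint that lists the images of an album.
ALBUM_IMAGES_URL = "https://api.imgur.com/3/album/{album_id}/images"


def build_album_request(album_id, client_id):
    """Build (but do not send) the GET request for an album's images."""
    return urllib.request.Request(
        ALBUM_IMAGES_URL.format(album_id=album_id),
        headers={"Authorization": f"Client-ID {client_id}"},
    )


def image_links(payload):
    """Pull the direct image links out of a decoded API response."""
    if not payload.get("success"):
        return []
    return [item["link"] for item in payload.get("data", []) if "link" in item]


def fetch_album_links(album_id, client_id):
    """Send the request and return the direct image links (needs network access)."""
    request = build_album_request(album_id, client_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return image_links(json.load(response))
```

Each returned link is a direct image URL, so it could be handed to the existing single-image transcription code unchanged.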

perryprog commented 6 years ago

I already have an API key and image downloading working; the only thing I had left to do was posting the comments.

codingJWilliams commented 6 years ago

Oh okay, look forward to seeing that then