stev-ou / review_ocr

Using Google's Tesseract OCR to extract data from public PDFs
GNU General Public License v3.0

External Store for PDFs, Data, etc #6

Open samjett247 opened 5 years ago

samjett247 commented 5 years ago

It would be smart to back up all of the scraped PDFs. That way the parsing could be repeated if needed, even if the public data is ever taken down.

samjett247 commented 5 years ago

Dropbox, Google Drive, or similar... Thoughts, @schuermannator?

zachschuermann commented 5 years ago

Yep, let's plan on storing in Google Cloud just so everything lives in the same place.

zachschuermann commented 5 years ago

We still fall within the Google Cloud Storage free tier. I'll put a little something together for this.
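For illustration only, a minimal sketch of what such a backup could look like. It is not the repo's actual code: it assumes the `google-cloud-storage` Python client is installed and credentials are configured, and the function names, the `scraped_pdfs` prefix, and any bucket or directory names are hypothetical.

```python
from pathlib import Path


def plan_uploads(pdf_dir, prefix="scraped_pdfs"):
    """Pair each local PDF with the object name it would get in the bucket."""
    root = Path(pdf_dir)
    plan = []
    for path in sorted(root.rglob("*.pdf")):
        # Keep the relative folder layout so a later re-parse can locate files
        # under the same paths they had locally.
        object_name = f"{prefix}/{path.relative_to(root).as_posix()}"
        plan.append((path, object_name))
    return plan


def upload_pdfs(bucket_name, plan):
    """Upload every planned file (assumes google-cloud-storage and credentials)."""
    # Imported lazily so the planning step works without the client installed.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    for path, object_name in plan:
        blob = bucket.blob(object_name)
        blob.upload_from_filename(str(path), content_type="application/pdf")
```

A call such as `upload_pdfs("some-bucket", plan_uploads("pdfs"))` would then copy everything under a local `pdfs` directory into the bucket. Separating the planning step from the upload makes it possible to inspect what would be stored before anything is sent.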

samjett247 commented 5 years ago

Note the similar issue on the Backend repo here. I'm deleting that issue in backend so we can talk about it here.