internetarchive / Zeno

State-of-the-art web crawler 🔱
GNU Affero General Public License v3.0

Extract URLs from images #86

Open CorentinB opened 4 months ago

CorentinB commented 4 months ago

It would be interesting to try running OCR on images (as an option) to extract URLs from watermarks and similar overlays.

yzqzss commented 3 months ago

OCR might be slow and inaccurate, but how about extracting URLs from QR codes in images?

CorentinB commented 3 months ago

> OCR might be slow and inaccurate, but how about extracting URLs from QR codes in images?

Very good idea. (Not a priority, though. Maybe it should be a separate issue?)
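
Whichever decoder ends up being used (OCR or a QR reader), its output is plain text that still has to be filtered down to crawlable URLs. Below is a minimal sketch of that filtering step only, in Python for brevity. The helper name `extract_urls` and the regular expression are hypothetical and not taken from Zeno, and the image decoding itself is deliberately left out. It assumes that watermarks often omit the scheme (`www.example.com`) and that QR payloads may hold non-URL data such as Wi-Fi credentials.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: http(s) URLs plus bare "www." hosts,
# since watermark text frequently drops the scheme.
_URL_RE = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)


def extract_urls(decoded_text: str) -> list[str]:
    """Return de-duplicated http(s) URLs found in text decoded from an image.

    `decoded_text` is whatever an OCR engine or QR decoder produced;
    obtaining it is outside the scope of this sketch.
    """
    urls: list[str] = []
    for match in _URL_RE.findall(decoded_text):
        # Drop trailing punctuation that usually belongs to the sentence, not the URL.
        candidate = match.rstrip(".,;:!?)]}")
        # Assumption: a bare "www." host is reachable over plain http.
        if candidate.lower().startswith("www."):
            candidate = "http://" + candidate
        try:
            parts = urlsplit(candidate)
        except ValueError:
            # urlsplit rejects some malformed inputs, e.g. a broken IPv6 literal.
            continue
        if parts.scheme not in ("http", "https"):
            continue
        if "." not in parts.netloc:
            # Crude sanity check against OCR noise such as "http://l".
            continue
        if candidate not in urls:
            urls.append(candidate)
    return urls
```

For example, `extract_urls("Visit www.example.com/gallery.")` yields `["http://www.example.com/gallery"]`, while a Wi-Fi QR payload such as `WIFI:T:WPA;S:home;P:secret;;` yields an empty list. Keeping this step separate from the decoder would let the OCR and QR options share the same filtering, whichever one is implemented first.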