DaveParr / PyFest


Create a list of artists based on lineup poster #11

Closed DaveParr closed 1 year ago

DaveParr commented 1 year ago

Many festivals publish a lineup poster. The image contains the artist information, but only as text rendered inside the image, and that text is often stylised uniquely for each band:

Slamdunk

Slamdunk produces multiple posters as they announce their lineup each year: https://www.google.com/search?rlz=1C5CHFA_enGB998GB998&sxsrf=ALiCzsargpPLT4ChN2CcnIKYvtW5hfj2zQ:1666196361041&source=univ&tbm=isch&q=slam+dunk+lineup&fir=m4mzivCecp2rAM%252CdY1wBjdkyP77EM%252C_%253B2EjozZ_DiGImJM%252CtEqRsjxVGkkC5M%252C_%253BTlt-Jk8ckpjs-M%252Cnbt8tIhYkGb4MM%252C_%253Bzc3W_ZbepIwoYM%252C3F6LUn42zY_ltM%252C_%253BdPWhQ_4Xl2W9xM%252CpArZMhCIl637HM%252C_%253BQw7dlY8xNIkkIM%252C4df90k62O6BoEM%252C_%253BezIxwBGIRkccPM%252Cemd-_b89Q-v5TM%252C_%253BCdBS8uWWeMTHcM%252CuvN1lnKh0AVswM%252C_%253BaZD-TKpdBn86_M%252C43LmYKsLXj_xJM%252C_%253B25Fd9_KFEilnyM%252CwVfTes1i-oAkzM%252C_&usg=AI4_-kQbvDkLMyZBWVdWZYKWvTPFbNPqLw&sa=X&ved=2ahUKEwiK8uaR2ez6AhXGUMAKHVK1DXYQ7Al6BAgKEFI&biw=1512&bih=865&dpr=2

It may be possible to extract this image-based text with an existing image-to-text (OCR) machine learning model.

DaveParr commented 1 year ago

The easyocr package might be the quickest win here.
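A minimal sketch of what that could look like, assuming easyocr is installed and that `readtext` returns `(bounding_box, text, confidence)` tuples as in its documented default output. The helper names `candidate_artists` and `read_poster`, the `0.4` confidence cut-off, and the poster path are all hypothetical and not part of PyFest:

```python
def candidate_artists(detections, min_confidence=0.4):
    """Filter raw OCR detections down to plausible artist-name strings.

    `detections` is a list of (bounding_box, text, confidence) tuples.
    The 0.4 threshold is an arbitrary starting point, not a tuned value.
    """
    seen = set()
    names = []
    for _bbox, text, confidence in detections:
        cleaned = " ".join(text.split())  # collapse stray whitespace
        key = cleaned.casefold()          # case-insensitive de-duplication
        if confidence >= min_confidence and len(cleaned) > 1 and key not in seen:
            seen.add(key)
            names.append(cleaned)
    return names  # kept in the order the OCR engine returned them


def read_poster(image_path, languages=("en",)):
    """Run EasyOCR over one poster image and return candidate names."""
    # Imported lazily: easyocr is third-party and downloads its models
    # the first time a Reader is created.
    import easyocr

    reader = easyocr.Reader(list(languages))
    return candidate_artists(reader.readtext(image_path))
```

Something like `read_poster("poster.png")` (path is a placeholder) would then give a rough list of strings to check by hand or to validate further. Keeping the filtering separate from the OCR call means the threshold can be tuned without re-running the model.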

Hugging Face has a community-made hosted app that demos how easyocr can be used: https://huggingface.co/spaces/Amrrs/image-to-text-app

Having put a few lineup posters through it, the accuracy leaves much to be desired, though it might still be useful in bulk. Potentially, artists can be validated by searching Wikipedia, with the search narrowed to artist pages.

DaveParr commented 1 year ago

Open-source OCR tools are outlined here: https://nanonets.com/blog/ocr-with-tesseract/#:~:text=Schedule%20a%20Demo-,Open%20Source%20OCR%20Tools,-There%20are%20a

chunswu commented 1 year ago

closed by #19