culturecreates / artsdata-planet-lavitrine

Export pipeline from Artsdata to LaVitrine

GTQ scrape HTML descriptions #21

Open saumier opened 11 months ago

saumier commented 11 months ago

The descriptions in the JSON-LD on GTQ carry no layout, and text from different paragraphs is sometimes run together, with no space at the end of each paragraph.

Instead of using the JSON-LD, the HTML from the event description should be scraped.

See discussion https://github.com/culturecreates/artsdata-planet-lavitrine/discussions/19

The problem is that line breaks are lost in the description. For example, the passage ...sa génération.À propos de Chansons hivernalesChansons hivernales... needs line breaks.
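
As a rough illustration of the scraping approach, here is a minimal Python sketch that turns an HTML description fragment into plain text with a blank line between block-level elements. It uses only the standard-library `html.parser`. The names `DescriptionExtractor`, `html_to_text` and `BLOCK_TAGS`, as well as the sample markup, are assumptions made for this example: the actual GTQ page structure and the way the pipeline fetches the page are not shown here and would need to be checked.

```python
from html.parser import HTMLParser

# Tags treated as paragraph boundaries. This set is an assumption and
# should be adjusted to the markup actually used on the GTQ event pages.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "li", "br"}


class DescriptionExtractor(HTMLParser):
    """Collect text from an HTML fragment, keeping a break between blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished paragraphs
        self.current = []  # text pieces of the paragraph being read

    def _flush(self):
        # Join the pieces, collapse runs of whitespace, keep non-empty text.
        text = " ".join("".join(self.current).split())
        if text:
            self.blocks.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self.current.append(data)


def html_to_text(fragment: str) -> str:
    """Return the fragment's text with a blank line between blocks."""
    parser = DescriptionExtractor()
    parser.feed(fragment)
    parser.close()
    parser._flush()  # keep any trailing text outside a block tag
    return "\n\n".join(parser.blocks)


# Hypothetical markup built from the passage quoted in this issue.
sample = (
    "<div><p>sa génération.</p>"
    "<h3>À propos de Chansons hivernales</h3>"
    "<p>Chansons hivernales</p></div>"
)
print(html_to_text(sample))
```

Inline tags such as `<strong>` or `<a>` are not in `BLOCK_TAGS`, so their text stays within the surrounding paragraph; only block-level tags and `<br>` introduce a break.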