planningalerts-scrapers / issues

Only for keeping track of all issues related to scraping

South Australia (SA) Planning Portal should not include html entities in the description field #513

Open softgrow opened 2 years ago

softgrow commented 2 years ago

I keep seeing &hellip; in the description of applications. This is confusing for the ultimate reader of the application as they don't know about HTML entities and assume instead something about helicopters, landing pads maybe? Which is actually a bit of a concern in SA for larger developments (long story).

To resolve this issue, the planning alerts application should not display HTML entities. One solution would be for the scraper to pass the text through unaltered, or to return the entire description. (The former assumes the planning alerts application is happy to consume Unicode.)

So, a worked example: https://www.planningalerts.org.au/applications/2469922 has the description

Construction of a mixed use building comprising 3 residential towers (2 x 13 storeys and 1 x 15 storeys), retail and commercial tenancies on the ground, first and second floors, 2.5 levels of basement carparking with loading and servicing areas and p&hellip;

Looking at the authority site https://plan.sa.gov.au/development_application_register#view-22014456-DAP we see

Description Construction of a mixed use building comprising 3 residential towers (2 x 13 storeys and 1 x 15 storeys), retail and commercial tenancies on the ground, first and second floors, 2.5 levels of basement carparking with loading and servicing areas and p... Show more.

If we click on Show more. it expands to:

Description Construction of a mixed use building comprising 3 residential towers (2 x 13 storeys and 1 x 15 storeys), retail and commercial tenancies on the ground, first and second floors, 2.5 levels of basement carparking with loading and servicing areas and publicly accessible outdoor terrace on the second floor.

Ideally the scraper should return all of the description e.g.

Construction of a mixed use building comprising 3 residential towers (2 x 13 storeys and 1 x 15 storeys), retail and commercial tenancies on the ground, first and second floors, 2.5 levels of basement carparking with loading and servicing areas and publicly accessible outdoor terrace on the second floor.

or, failing that, use a horizontal ellipsis and return

Construction of a mixed use building comprising 3 residential towers (2 x 13 storeys and 1 x 15 storeys), retail and commercial tenancies on the ground, first and second floors, 2.5 levels of basement carparking with loading and servicing areas and p...

With either of these approaches, the person viewing the application would no longer see HTML entities.
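
As a rough illustration of the two options, here is a minimal Python sketch. It is not the scraper's actual code: the function name `clean_description` and its `truncated` and `full` parameters are made up for this example, and it assumes the scraper has the shortened text and, where it managed to fetch it, the expanded text as plain strings. It prefers the full description and otherwise decodes entities so the reader gets a real ellipsis character.

```python
import html
from typing import Optional


def clean_description(truncated: str, full: Optional[str] = None) -> str:
    """Return a description with no HTML entities left in it.

    Hypothetical helper for illustration only:
    `truncated` is the shortened text as scraped,
    `full` is the expanded text if the scraper was able to fetch it.
    """
    # Prefer the full description whenever it is available.
    text = full if full else truncated
    # Decode entities such as &hellip; or &amp; into plain Unicode characters.
    return html.unescape(text).strip()


# Falls back to the truncated text, ending in a horizontal ellipsis (U+2026).
print(clean_description("loading and servicing areas and p&hellip;"))
```

Decoding in the scraper only helps if the planning alerts application is happy to consume Unicode, as noted above.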

katska commented 1 month ago

Missive conversation: https://mail.missiveapp.com/#inbox/conversations/3820d053-199a-46d8-8bef-55362e80b02c