robbrad / UKBinCollectionData

UK Council Bin Collection Data Parser Outputting Bin Data as a JSON
MIT License

Gedling Borough Council (Street names with multiple collections) #757

Closed jamesmacwhite closed 1 week ago

jamesmacwhite commented 2 weeks ago

Name of Council

Gedling Borough Council

Issue Information

The current stage one implementation of the Gedling Borough Council parser does not account for street name searches that return multiple results.

Example street name values which do this are Oxclose Lane, Westdale Lane, Breck Hill Road.

These examples return multiple rows in the refuse/recycling/garden waste collection tables, so targeting the "Download Calendar" link by a fixed ID will not work reliably in this scenario: it will only pick the first row, which might not be the right schedule for these areas.

Instead, querying each table by its ID, parsing the tbody tr rows into an array and mapping the cell data of each row to values works.
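A minimal sketch of that row-by-row approach, using only the Python standard library. The table ID `refuse` and the sample markup are hypothetical stand-ins, not the real IDs or HTML of the Gedling page, and the parser treats any row containing td cells as a data row rather than requiring an explicit tbody.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect (text, href) pairs for every data row of one table, chosen by id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.rows = []         # one list of (text, href) tuples per data row
        self._in_table = False
        self._row = None       # cells of the row currently being read
        self._cell = None      # text fragments of the cell currently being read
        self._href = None      # first link found in the current cell, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table" and attrs.get("id") == self.table_id:
            self._in_table = True
        elif self._in_table and tag == "tr":
            self._row = []
        elif self._row is not None and tag == "td":
            self._cell, self._href = [], None
        elif self._cell is not None and tag == "a" and self._href is None:
            self._href = attrs.get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            text = " ".join("".join(self._cell).split())  # collapse whitespace
            self._row.append((text, self._href))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row, self._cell = None, None
        elif tag == "table":
            self._in_table = False


def parse_table_rows(html, table_id):
    """Return every data row of the table with the given id."""
    parser = TableRowParser(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup shaped like a two-row search result.
SAMPLE_HTML = """
<table id="refuse">
  <thead><tr><th>Location</th><th>Calendar</th></tr></thead>
  <tbody>
    <tr><td>Oxclose Lane (Nos 201-267)</td>
        <td><a href="data/MondayG2-2024.pdf">Download Calendar</a></td></tr>
    <tr><td>Oxclose Lane (Nos 37-143 &amp; 100-198)</td>
        <td><a href="data/TuesdayG1-2024.pdf">Download Calendar</a></td></tr>
  </tbody>
</table>
"""

rows = parse_table_rows(SAMPLE_HTML, "refuse")
```

Because every row is kept, the caller can choose the entry whose location text matches the address instead of silently taking the first one.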

If it is useful, I have built an API, deployed as a Cloudflare Worker, at https://api.gbcbincalendars.co.uk. It queries https://apps.gedling.gov.uk/refuse/search.aspx but returns the data as JSON, and it handles multiple rows by returning them as a JSON array. It takes a single query parameter, streetName.

Example queries:

https://api.gbcbincalendars.co.uk?streetName=Oxclose%20Lane
{
    "streetNameQuery": "Oxclose Lane",
    "refuseCollections": [
        {
            "Location": "Oxclose Lane (Nos 201-267)",
            "Area": "Arnold",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/MondayG2-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/monday-g2",
            "Schedule Identifier": "monday-g2",
            "Schedule Name": "Monday G2"
        },
        {
            "Location": "Oxclose Lane (Nos 37-143 & 100-198)",
            "Area": "Daybrook",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/TuesdayG1-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/tuesday-g1",
            "Schedule Identifier": "tuesday-g1",
            "Schedule Name": "Tuesday G1"
        }
    ],
    "gardenWasteCollections": [
        {
            "Location": "Oxclose Lane",
            "Numbers": null,
            "Area": "Arnold",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20D-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-d",
            "Schedule Identifier": "thursday-d",
            "Schedule Name": "Thursday D"
        },
        {
            "Location": "Oxclose Lane",
            "Numbers": null,
            "Area": "Daybrook",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20D-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-d",
            "Schedule Identifier": "thursday-d",
            "Schedule Name": "Thursday D"
        }
    ],
    "viewState": "uAJWFdR3Fmw+9x7iDjVCiKceh9sxZNFxcidT4nDU7r2YeKEa1Fk3YrOo944f1vtUfmVtyYpi8hk2xpfjXxQJ+80WwFpXs5He+x5VQNN8R+O4347DZXA6lzdjIQVfqH+BhWBfeMXuKtL6rIlGPoeMmLiBPhCqf9oWQq5fw2GyigyyyT/tCycvxHEOEp3tPnxL6X4OcTueJJt6hjJl7ieE24ChSZJ2TAaih1Hkz9V5oM1VbSG4oj/eX2sF0FZyF6/FO0KYB+/A46THPcWTu6yJvA==",
    "viewStateGenerator": "3260408D",
    "eventValidation": "Lk30i+StoYD9K0QY5ockQemZzLEuSQTgH1jYdTH6ivRaXOqmcZgoK7OYI5Fo/+mTGwPUZKh1U86rtgqEBsaJCCt5kaJA/qquqcaFf6Xz1EF4HQAk171jwqlnYB4VDyJAIFJF0+fGjd2UnwLXIA19Yg=="
}
https://api.gbcbincalendars.co.uk?streetName=Westdale%20Lane
{
    "streetNameQuery": "Westdale Lane",
    "refuseCollections": [
        {
            "Location": "Westdale Lane East (152-166 even only)",
            "Area": "Gedling",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/FridayG3-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/friday-g3",
            "Schedule Identifier": "friday-g3",
            "Schedule Name": "Friday G3"
        },
        {
            "Location": "Westdale Lane East (Nos 1 - 275 odd and 2 - 150 &168 - 292A even) (152-166 put bins on Besecar Ave)",
            "Area": "Gedling/Carlton",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/ThursdayG2-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-g2",
            "Schedule Identifier": "thursday-g2",
            "Schedule Name": "Thursday G2"
        },
        {
            "Location": "Westdale Lane West (Nos 289 - 401 odd and 294 - 396 even)",
            "Area": "Mapperley",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/WednesdayG3-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/wednesday-g3",
            "Schedule Identifier": "wednesday-g3",
            "Schedule Name": "Wednesday G3"
        },
        {
            "Location": "Westdale Lane West (Nos 403 - 473 odd and 398 - 450 even)",
            "Area": "Mapperley",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/WednesdayG2-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/wednesday-g2",
            "Schedule Identifier": "wednesday-g2",
            "Schedule Name": "Wednesday G2"
        },
        {
            "Location": "Westmoore Court Westdale Lane",
            "Area": "Mapperley",
            "Calendar URL": "https://apps.gedling.gov.uk/refuse/data/WednesdayG2-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/wednesday-g2",
            "Schedule Identifier": "wednesday-g2",
            "Schedule Name": "Wednesday G2"
        }
    ],
    "gardenWasteCollections": [
        {
            "Location": "Westdale Lane East",
            "Numbers": null,
            "Area": "Carlton",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20I-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-i",
            "Schedule Identifier": "thursday-i",
            "Schedule Name": "Thursday I"
        },
        {
            "Location": "Westdale Lane East",
            "Numbers": null,
            "Area": "Gedling",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20I-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-i",
            "Schedule Identifier": "thursday-i",
            "Schedule Name": "Thursday I"
        },
        {
            "Location": "Westdale Lane West",
            "Numbers": null,
            "Area": "Gedling",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20F-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/monday-f",
            "Schedule Identifier": "monday-f",
            "Schedule Name": "Monday F"
        },
        {
            "Location": "Westdale Lane West",
            "Numbers": "300-300",
            "Area": "Gedling",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20I-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/thursday-i",
            "Schedule Identifier": "thursday-i",
            "Schedule Name": "Thursday I"
        },
        {
            "Location": "Westdale Lane West",
            "Numbers": "282-282",
            "Area": "Gedling",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20J-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/friday-j",
            "Schedule Identifier": "friday-j",
            "Schedule Name": "Friday J"
        },
        {
            "Location": "Westdale Lane West",
            "Numbers": null,
            "Area": "Mapperley",
            "Calendar URL": "https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20F-2024.pdf",
            "Email Subscribe URL": "https://pages.comms.gedling.gov.uk/pages/monday-f",
            "Schedule Identifier": "monday-f",
            "Schedule Name": "Monday F"
        }
    ],
    "viewState": "XmR5fOVMaeXQmOlJ/259N5xHnxtVtT5GBuHPlkxr+l62Hp7pSHDvSRTrwdEfcYSayslhhSK1S3MvSvF4Gqh/1mpBC+4kkJKrrUkohrzYDM0UPhJfATauWOMuRb5y8Vy7zTSJ4TBAz6uDWUH0ayWqkSqvf0Ht0i32j1o5tC++U/HEdHg+PaRlxIIRK9QeaG6KXEds9y4VtxJVX/O/vr/yWauPMuI9rUXWEdL3R6NGZ5D0W0EfTNgXvDGD+w658PiGup5ZIZ3Hs2YaqmHT3S+keA==",
    "viewStateGenerator": "3260408D",
    "eventValidation": "AmrlKCS2rj3T9TNHG+pO1GbdYs2jxWdkznLlskIjo2aD/ShL7rReEcx9bEBlYcppxgCOSqXaVcom5MVtsIIaQFTzkQZ1Qx76WRX40Y6lH97euS9e+stFTWN9z3IkrkNPj7jrfwIMwL0roVU0ir9pMw=="
}
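
For anyone consuming these responses from Python, a rough sketch might look like the following. The endpoint, the streetName parameter and the JSON keys are taken from the examples above; the function names, the timeout and the assumption that the API accepts a plain urllib request are illustrative only.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_BASE = "https://api.gbcbincalendars.co.uk"


def build_query_url(street_name):
    """Build the query URL, encoding spaces as %20 as in the examples above."""
    return f"{API_BASE}?{urlencode({'streetName': street_name}, quote_via=quote)}"


def fetch_collections(street_name, timeout=10):
    """Fetch and decode the JSON response for one street name (needs network)."""
    with urlopen(build_query_url(street_name), timeout=timeout) as response:
        return json.load(response)


def list_schedules(data, key="refuseCollections"):
    """Return (Location, Area, Schedule Name) for every row under the given key."""
    return [
        (row["Location"], row["Area"], row["Schedule Name"])
        for row in data.get(key, [])
    ]
```

Passing `key="gardenWasteCollections"` lists the garden waste rows instead. The result is a list rather than a dictionary keyed by location because several rows can share the same "Location" value and differ only by "Area" or "Numbers".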

I built this for a front end search, part of gbcbincalendars.co.uk, that shows the specific collection name, but it's an open API with CORS headers set, given it's technically just a proxy for Gedling's geriatric refuse search.

Feel free to use it in this project. The intention is to have reliable JSON bin collection data for Gedling.

dp247 commented 1 week ago

Oooooh, I love this! You're doing the work of a saint for Gedling!

jamesmacwhite commented 1 week ago

Thanks! The street name search for Refuse Collection Days from Gedling is weird, to be honest. It appears to do partial matching on the values of the "Location" and "No's" columns in the database table. For example, a search for simply "1" returns results, because some collection rows contain specific house numbers, but you'll get loads of results.

The other issue, which I haven't handled, is paginated responses. Really wide queries return results with pagination. I may look at handling this for the front end build, but for now I have set a few minimum validation rules on the search term in the API, just to try and avoid that.

The reason for the API was to build an alternative search front end, but I just ended up opening up the API in case anyone wants to use it outside of the original website. The irony of the original search is that it doesn't directly state the collection weekday or schedule name. The only place they exist is in the email subscribe URL, by parsing out the final URL part, e.g. monday-g1.
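
As a small illustration of that last step, the sketch below derives the identifier, weekday and schedule name from an email subscribe URL. The function name and the "Weekday" key are made up for this example; the "Schedule Identifier" and "Schedule Name" formats follow the API responses earlier in the thread, and the code assumes the final path segment always has the weekday-code shape.

```python
from urllib.parse import urlparse


def schedule_from_subscribe_url(url):
    """Derive schedule details from the last path segment of a subscribe URL."""
    identifier = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    # Split "monday-g2" into the weekday and the round code.
    weekday, _, code = identifier.partition("-")
    return {
        "Schedule Identifier": identifier,
        "Weekday": weekday.capitalize(),
        "Schedule Name": f"{weekday.capitalize()} {code.upper()}".strip(),
    }


refuse = schedule_from_subscribe_url(
    "https://pages.comms.gedling.gov.uk/pages/monday-g2"
)
garden = schedule_from_subscribe_url(
    "https://pages.comms.gedling.gov.uk/pages/thursday-d"
)
```

Here `refuse["Schedule Name"]` is "Monday G2" and `garden["Schedule Name"]` is "Thursday D", matching the values shown in the example responses.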