Closed roberthunt closed 10 months ago
New calendars.
MondayG1.pdf MondayG2.pdf MondayG3.pdf MondayG4.pdf TuesdayG1.pdf TuesdayG2.pdf TuesdayG3.pdf TuesdayG4.pdf WednesdayG1.pdf WednesdayG2.pdf WednesdayG3.pdf WednesdayG4.pdf ThursdayG1.pdf ThursdayG2.pdf ThursdayG3.pdf ThursdayG4.pdf FridayG1.pdf FridayG2.pdf FridayG3.pdf FridayG4.pdf
Garden Waste A.pdf Garden Waste B.pdf Garden Waste C.pdf Garden Waste D.pdf Garden Waste E.pdf Garden Waste F.pdf Garden Waste G.pdf Garden Waste H.pdf Garden Waste I.pdf Garden Waste J.pdf
If this one can be scraped and included in this project, I'll be over the moon.
I've been battling with this nonsensical way of this data being provided and have asked Gedling on multiple occasions to either provide a simple list of dates or, better yet, an API for it but to no avail.
No promises but... have you got the URLs for those calendars 😉
Yes, keep in mind they seem to re-use the URLs from year to year so they have recently swapped over to delivering the 2023/2024 calendar now. The files for last year are above though by way of reference in how they may change.
https://apps.gedling.gov.uk/refuse/data/MondayG1.pdf https://apps.gedling.gov.uk/refuse/data/MondayG2.pdf https://apps.gedling.gov.uk/refuse/data/MondayG3.pdf https://apps.gedling.gov.uk/refuse/data/MondayG4.pdf https://apps.gedling.gov.uk/refuse/data/TuesdayG1.pdf https://apps.gedling.gov.uk/refuse/data/TuesdayG2.pdf https://apps.gedling.gov.uk/refuse/data/TuesdayG3.pdf https://apps.gedling.gov.uk/refuse/data/TuesdayG4.pdf https://apps.gedling.gov.uk/refuse/data/WednesdayG1.pdf https://apps.gedling.gov.uk/refuse/data/WednesdayG2.pdf https://apps.gedling.gov.uk/refuse/data/WednesdayG3.pdf https://apps.gedling.gov.uk/refuse/data/WednesdayG4.pdf https://apps.gedling.gov.uk/refuse/data/ThursdayG1.pdf https://apps.gedling.gov.uk/refuse/data/ThursdayG2.pdf https://apps.gedling.gov.uk/refuse/data/ThursdayG3.pdf https://apps.gedling.gov.uk/refuse/data/ThursdayG4.pdf https://apps.gedling.gov.uk/refuse/data/FridayG1.pdf https://apps.gedling.gov.uk/refuse/data/FridayG2.pdf https://apps.gedling.gov.uk/refuse/data/FridayG3.pdf https://apps.gedling.gov.uk/refuse/data/FridayG4.pdf
https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20A.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20B.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20C.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20D.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20E.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20F.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20G.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20H.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20I.pdf https://apps.gedling.gov.uk/GDW/Rounds/data/Garden%20Waste%20J.pdf
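The URLs above follow a regular naming scheme, so they can be generated rather than listed by hand. A minimal sketch, assuming the two base paths and the day/G-number/letter naming stay as they appear above (the function names are just for illustration):

```python
from urllib.parse import quote

# Base paths copied from the URLs listed above.
REFUSE_BASE = "https://apps.gedling.gov.uk/refuse/data/"
GARDEN_BASE = "https://apps.gedling.gov.uk/GDW/Rounds/data/"

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def refuse_calendar_urls():
    """Build the 20 refuse calendar URLs (5 days x G1..G4)."""
    return [f"{REFUSE_BASE}{day}G{n}.pdf" for day in DAYS for n in range(1, 5)]


def garden_calendar_urls():
    """Build the 10 garden waste calendar URLs (rounds A..J)."""
    # quote() percent-encodes the spaces in the file name as %20.
    return [GARDEN_BASE + quote(f"Garden Waste {letter}.pdf") for letter in "ABCDEFGHIJ"]


if __name__ == "__main__":
    for url in refuse_calendar_urls() + garden_calendar_urls():
        print(url)
```

Since the council seems to re-use the same URLs each year, the generated list should only need revisiting if the naming scheme itself changes.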
Cheers. I've also sent an FOI request to the council for their data, so we may have two ways to go about it.
That is genius! 😂
I got a response... they sent me PDF files
The joy of dealing with Gedling. 😂
When asked about API access, following their email alerts recently falling over and either sending people notifications for the wrong bin to be collected (even different bins to different individuals subscribed from the same house 🤦🏼♂️) or no email at all they've said
we are looking at options including an easier interface but, for now, we will continue with the email alerts, we're just having a few issues since we moved to a new system.
Tell them they are welcome to open a pull request on this GitHub repository as an option.
If we do decide to do something funky with the PDFs - please keep in mind
https://github.com/robbrad/UKBinCollectionData/issues/493#issuecomment-1859126983
I know it's not very 'smart' but would it be a half way house if we were just to hard code the data? You could still use the address lookup to check the right lookup data. It would mean once a year someone would have to grab the data and put it in a sensible format so someone can submit the changes. I'm happy to write the first version up - @roberthunt would you be happy checking in on this yearly to update the data or create a request for someone else to do it?
I know it's not ideal, but anything else seems like it'll either take much longer or not happen at all. And at least it gives the poor folk of Gedling HA integration?
@robbrad ?
I'm okay with that. I know it's less than ideal, but can it be a JSON dictionary in the Python council file? The only reason I say this is if we start having extra files in the repository, it dilutes the structure we currently have. What do you think, @skelt0 ?
Yeah ok! I'll see what I can pull together!
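For reference, the hard-coded approach discussed above might look roughly like this. It is only a sketch: the street name, the calendar assignment, and the dates are placeholders, and the real council file may structure things quite differently.

```python
import json

# Hypothetical data: the street, its calendar, and the dates are placeholders,
# not real Gedling schedule data.
CALENDAR_JSON = """
{
  "streets": {"Example Street": "WednesdayG2"},
  "calendars": {
    "WednesdayG2": [
      {"date": "2024-01-03", "bins": ["Household"]},
      {"date": "2024-01-10", "bins": ["Recycling", "Glass"]}
    ]
  }
}
"""

DATA = json.loads(CALENDAR_JSON)


def collections_for_street(street):
    """Return the hard-coded collections for a street, or [] if unknown."""
    calendar = DATA["streets"].get(street)
    if calendar is None:
        return []
    return DATA["calendars"].get(calendar, [])


if __name__ == "__main__":
    for entry in collections_for_street("Example Street"):
        print(entry["date"], ", ".join(entry["bins"]))
```

Keeping the JSON as a string inside the Python file avoids adding extra data files to the repository, at the cost of a yearly manual update.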
This may or may not help you get the data out https://github.com/pymupdf/PyMuPDF
Another option, if there is some way to go from PDF to HTML, is to extract the data from that rather than typing it by hand.
@robbrad - Check out an initial stab at this: https://github.com/skelt0/UKBinCollectionData/blob/feat-gedling-borough-council/uk_bin_collection/uk_bin_collection/councils/GedlingBoroughCouncil.py
The calendar data is pretty predictable, as mentioned somewhere above, so I've made a helper script to generate the dates based on three values. It makes it a million times quicker. I wouldn't like to predict that this predictive modelling will work in future years though (even/odd weeks, and 1 in 4 for glass), so it's 50/50 on whether I save the helper script somewhere.
Let me know what you think and I can continue putting the data in. Currently the link above works for the supplied street's refuse data (Black, Blue and Glass bins).
Note: This isn't tidied up yet and the address is currently hardcoded.
Looking good!
@roberthunt - can you let me know how you get on with this? It's hand entered - I've tried to match all the changes due to bank holidays.
Also - I was wondering if the FoI process could be repeated - but asking for an accessible (for screen reader) version of the data? Surely they must need to supply this data in an accessible format when requested?
Anyway - hope this works out!
If it helps, I've converted the horrible PDFs into the iCal format and hosted them for use, as I had already done this for my own schedule, Wednesday G2. The schedules generally follow a consistent pattern, with the exception of bank holidays, which are marked as changed collection days.
https://github.com/jamesmacwhite/gedling-borough-council-bin-calendars
If you want to argue the case on legal grounds, all councils fall under the Public Sector Bodies (Websites and Mobile Applications) (No. 2) Accessibility Regulations 2018, so they are legally required to make content accessible. Because the calendars provided were created after 2018, the council would be required to provide an alternative format. If you want to push the issue, they are technically not meeting accessibility regulations with the formats provided.
James, you're a superstar! Thanks for doing that and for sharing.
No worries! It was great to come across this project and that it exists to create an API layer when there is none. Unfortunately for Gedling Borough Council, the PDFs are the only data source outside of the email reminder service, but while the email service is better accessibility wise given it's HTML, this does not provide full schedule data, so it's either PDF or nothing, which is horrible and borderline on their accessibility statement as referenced.
You could in theory trigger an automation when the email reminder is received and parse the data out of that. Consistent properties like the sender and subject are available:
From: GBC Bin Reminder Alert <news@comms.gedling.gov.uk>
Subject: We're collecting your bin tomorrow, please it out by 6am
The heading which contains the bin type is under a <h2> element but does not have a specific ID. There are also two <h2> elements, so you'd have to take the first occurrence and then parse out the all-caps text, as that's what they use for the bin type.
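A rough sketch of that parsing step, using only the standard library. The sample HTML in the usage is made up for illustration (I'm not reproducing the real email markup here), and the all-caps regex is an assumption about how the bin type appears in the heading.

```python
import re
from html.parser import HTMLParser


class FirstH2Parser(HTMLParser):
    """Collect the text of the first <h2> element only."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self._done = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and not self._done:
            self._in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_h2:
            self._in_h2 = False
            self._done = True  # ignore any later <h2> elements

    def handle_data(self, data):
        if self._in_h2:
            self.text += data


def bin_type_from_email(html):
    """Return the first run of all-caps words in the first <h2>, or None."""
    parser = FirstH2Parser()
    parser.feed(html)
    # Assumption: the bin type is written as one or more words in capitals.
    match = re.search(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b", parser.text)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Hypothetical email body, for illustration only.
    sample = "<h2>Your BLACK BIN is due tomorrow</h2><h2>Other heading</h2>"
    print(bin_type_from_email(sample))
```

This is exactly the kind of fragile DOM parsing that breaks as soon as the email template changes, so treat it as a fallback at best.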
I looked at this originally, but by the time you've looked at the email automation/HTML scraping side of things, with the fragile nature of DOM/HTML parsing and the fact the Garden Waste Collection service is completely outside of this, just converting everything to iCal seems easier and at least reliable, providing the occurrence scheduling aligns to the original PDF. So that's what I ended up doing, after seeing a few others around home automation having the same issues with Gedling. Who knew Gedling has 20 different bin collections!
We should still push Gedling Borough Council to look at this long term though. The PDFs themselves are and always will be print documents, which Gedling won't actually print anymore anyway due to cost/sustainability, so the format in my view is outdated. Clearly, if the email reminder service exists, they have some form of scheduling system behind the scenes, so it doesn't seem too far a step to publish official iCal calendars.
I've also mocked up a web page with all the iCal links for easy reference as well: https://jamesmacwhite.github.io/gedling-borough-council-bin-calendars/. I'm not going to go as far as buy a domain name for the site, but a static Jekyll site should make it easier, rather than messing around with the Raw button on GitHub.
Love that! Thanks again for your work on this, made my life a lot easier.
You're welcome. HTML and JSON formats are also provided, making the data more accessible and open!
Since #763 was merged, this project now uses API data from gbcbincalendars.co.uk, which removes the static data issue. There is still a requirement to create iCal data for each calendar each year, but this should be a lower maintenance burden, because using recurring calendar occurrences avoids listing every single date manually. The JSON data is expanded to provide the collection dates in full, generated from the iCal RRULE data.
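To illustrate what expanding RRULE data into full dates means, here is a deliberately minimal, standard-library-only sketch. It handles only FREQ=WEEKLY with INTERVAL and COUNT or UNTIL, ignores things like BYDAY and EXDATE, and is not how the gbcbincalendars.co.uk data is actually generated; a proper iCal library would be the sensible choice for real use.

```python
from datetime import date, timedelta


def expand_weekly_rrule(start, rrule):
    """Expand a simple FREQ=WEEKLY rule into ISO date strings."""
    parts = dict(item.split("=", 1) for item in rrule.split(";"))
    if parts.get("FREQ") != "WEEKLY":
        raise ValueError("only FREQ=WEEKLY is handled in this sketch")

    interval = int(parts.get("INTERVAL", "1"))
    count = int(parts["COUNT"]) if "COUNT" in parts else None
    until = None
    if "UNTIL" in parts:
        # UNTIL looks like 20240117 or 20240117T000000Z; keep the date part.
        u = parts["UNTIL"][:8]
        until = date(int(u[:4]), int(u[4:6]), int(u[6:8]))
    if count is None and until is None:
        raise ValueError("this sketch needs COUNT or UNTIL to terminate")

    dates = []
    current = start
    while (count is None or len(dates) < count) and (until is None or current <= until):
        dates.append(current.isoformat())
        current += timedelta(weeks=interval)
    return dates


if __name__ == "__main__":
    # Example rule and start date are made up for illustration.
    print(expand_weekly_rrule(date(2024, 1, 3), "FREQ=WEEKLY;INTERVAL=2;COUNT=3"))
```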
Do we need to capture this process in the wiki at all?
And may I say, fabulous work @jamesmacwhite
Thanks. Glad it can be of use to other projects!
One thing you might want to highlight in your wiki: there's at least one case where a valid street name only returns data for one type of collection and not both. Odd, right? Not sure how that's valid, to be honest. I double-checked this at the source and confirmed it's an oddity with Gedling's data.
Using Beswick Close as the example:
No refuse data is returned, yet it does have garden collection data.
I've confirmed Beswick Close is within the Gedling boundary, but that's not really a surprise when clearly you can have a garden collection calendar!
I happened to come across this because there's some Google Analytics tracking on searches, and I cross-check some searches locally just to ensure they are returning data correctly; this is the one that revealed this kind of scenario is possible. More Gedling fun. I updated my own search tool to handle the scenario. The API response of an empty array for collections with no data is valid, but I guess I never expected this to occur for just one type.
My suspicion is that, as it's a relatively new area built in the past two years, it could be a data lag.
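For anyone consuming the data, handling that case could look something like the sketch below. The response shape (a dict keyed by collection type, each holding a list) is purely an assumption for illustration and may not match the actual API.

```python
def split_collections(response):
    """Separate collection types that have data from those that are empty."""
    available = {}
    missing = []
    for kind, entries in response.items():
        if entries:
            available[kind] = entries
        else:
            # An empty list is a valid response, just with nothing to show.
            missing.append(kind)
    return available, missing


if __name__ == "__main__":
    # Hypothetical response for a street with garden data but no refuse data.
    response = {"refuse": [], "garden": [{"date": "2024-03-04"}]}
    available, missing = split_collections(response)
    print("available:", list(available))
    print("missing:", missing)
```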
Name of Council
Gedling Borough Council
Example Address/Postcode
Valeside Gardens
Additional Information
This one may be quite challenging, some facts:
Ideas
2022 / 2023
https://apps.gedling.gov.uk/refuse/search.aspx
Household (Black Bin) / Glass (Green Box) / Recycling (Green Bin)
At a glance, the G number seems to correlate to the week that the glass collection occurs (glass + recycling), starting in the first month (December). So WednesdayG3 would have glass collection 3rd week Dec 2022.
Collection pattern is [Household -> Recycling -> Household -> Recycling/Glass]
MondayG1.pdf MondayG2.pdf MondayG3.pdf MondayG4.pdf TuesdayG1.pdf TuesdayG2.pdf TuesdayG3.pdf TuesdayG4.pdf WednesdayG1.pdf WednesdayG2.pdf WednesdayG3.pdf WednesdayG4.pdf ThursdayG1.pdf ThursdayG2.pdf ThursdayG3.pdf ThursdayG4.pdf FridayG1.pdf FridayG2.pdf FridayG3.pdf FridayG4.pdf
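The four-week pattern and the G number described above can be sketched as follows. This assumes week 1 is the first collection week of the calendar and that the G number is the week in which the glass collection falls; it ignores bank holiday changes entirely, so the output would still need checking against the PDFs.

```python
from datetime import date, timedelta


def bins_for_week(week_number, glass_week):
    """Return the bins for a 1-based week number, given the G number (1-4)."""
    position = (week_number - glass_week) % 4
    if position == 0:
        return ["Recycling", "Glass"]  # the 1-in-4 glass week
    if position == 2:
        return ["Recycling"]  # the other recycling week
    return ["Household"]  # alternate weeks are household


def generate_schedule(first_collection, glass_week, weeks):
    """Generate (date, bins) pairs, one per week, with no holiday handling."""
    return [
        (first_collection + timedelta(weeks=i), bins_for_week(i + 1, glass_week))
        for i in range(weeks)
    ]


if __name__ == "__main__":
    # Assumed start date: the first Wednesday of December 2022, for WednesdayG3.
    for day, bins in generate_schedule(date(2022, 12, 7), glass_week=3, weeks=8):
        print(day.isoformat(), "/".join(bins))
```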
Garden Waste (Brown Bin)
Garden Waste A.pdf Garden Waste B.pdf Garden Waste C.pdf Garden Waste D.pdf Garden Waste E.pdf Garden Waste F.pdf Garden Waste G.pdf Garden Waste H.pdf Garden Waste I.pdf Garden Waste J.pdf
Verification