danielmichaels / fuelwatcher

A simple XML scraper for fuelwatch.wa.gov.au fuel price data.
MIT License

Historical data #19

Open giovannifi opened 3 years ago

giovannifi commented 3 years ago

I noticed that I can only get data that is no more than 7 days old. Earlier dates return empty results. Is this the expected behaviour, and is it a limitation of the website too? Many thanks, Giovanni

danielmichaels commented 3 years ago

Hey thanks for the issue,

I have not played with this in a while. I know on the website you can get historical data but I am not sure that is possible via their RSS feed, which this API effectively just scrapes.

If you want to, you're welcome to have a play and raise a PR if you can get more data out of it. Unfortunately, right now I do not have the time to do much more.

giovannifi commented 3 years ago

Hi Daniel, thanks for your kind reply and your explanation. This now makes sense to me.

Indeed, I tried https://www.fuelwatch.wa.gov.au/fuelwatch/fuelWatchRSS?Product=1&Day=24/02/2021, which does return some data. But if I try https://www.fuelwatch.wa.gov.au/fuelwatch/fuelWatchRSS?Product=1&Day=23/02/2021 or earlier, I do not get any data. I suppose the limitation is in the website itself: the RSS data is available only for a week.
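For anyone who wants to check the cutoff themselves, here is a minimal Python sketch (not part of fuelwatcher) that builds the same RSS URLs as above and counts the entries in each response. The helper names `build_rss_url`, `count_items`, `fetch_url` and `probe_days` are made up for illustration. It assumes the feed is ordinary RSS with `<item>` elements under `<channel>`, and that an out-of-range date gives either an empty body or a feed with no items; I have not confirmed which of the two the site returns.

```python
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

RSS_URL = "https://www.fuelwatch.wa.gov.au/fuelwatch/fuelWatchRSS"


def build_rss_url(day, product=1):
    """Build the RSS URL for one day, using the Day=DD/MM/YYYY form from this thread."""
    query = urlencode({"Product": product, "Day": day.strftime("%d/%m/%Y")}, safe="/")
    return f"{RSS_URL}?{query}"


def count_items(rss_bytes):
    """Count <item> elements; an empty body counts as zero (assumed feed layout)."""
    if not rss_bytes.strip():
        return 0
    root = ET.fromstring(rss_bytes)
    return len(root.findall("./channel/item"))


def fetch_url(url):
    """Download a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def probe_days(fetch, today, days_back):
    """Return {date: item_count} for today and the days before it.

    `fetch` is any callable that maps a URL to bytes, so the function
    can be exercised without touching the network.
    """
    counts = {}
    for offset in range(days_back):
        day = today - timedelta(days=offset)
        counts[day] = count_items(fetch(build_rss_url(day)))
    return counts
```

Calling `probe_days(fetch_url, date.today(), 10)` would then show where the item count drops to zero.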

I found a database here: https://www.fuelwatch.wa.gov.au/fuelwatch/pages/public/historicalFileDownloadRetail.jspx#, but the data is only available as a zip file for each day. I will try to find a way with Python to download a specific day, extract the zip, and read the data into a dataframe.
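As a starting point for the extract-and-read step, here is a rough sketch using only the standard library. It assumes each zip holds a CSV file, which I have not verified, and it leaves the download itself out because I do not know the per-day file URLs. `read_zipped_csv` is a hypothetical helper name.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_bytes):
    """Read the first CSV inside an in-memory zip into a list of row dicts.

    Assumption: the archive contains at least one .csv member.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # utf-8-sig drops a leading byte-order mark if the file has one
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

The returned list of dicts can be passed straight to `pandas.DataFrame(rows)` if a dataframe is wanted.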
