Open Irio opened 8 years ago
Holidays and events should be considered, as they usually significantly affect the price of rooms.
Does anybody have an idea how to proceed with this scraping? I mean, in addition to what is already being done by @Lrcezimbra at #100.
I had a look at booking.com but couldn't find any suitable API. I also tried decolar.com (they do have a public and free API [1]), but their terms of use don't seem to allow the kind of data scraping we need (I don't even know why I thought they would :smile:).
I don't believe there are historical databases for pricing. What could be done is to identify the hotels in the database, start watching booking/expedia/... and scrape data, building Serenata's own dataset for that. Keep in mind that hotel pricing is somewhat complex, and the database can become large.
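A rough sketch of what "building Serenata's own dataset" could look like on the storage side, leaving the scraping itself out. Everything here is an assumption made for illustration: the table name `price_observation`, its columns, the helper names and the sample values are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3
from datetime import date

# Hypothetical schema: one row per hotel, source, night and observation day.
SCHEMA = """
CREATE TABLE IF NOT EXISTS price_observation (
    hotel TEXT NOT NULL,
    source TEXT NOT NULL,
    stay_date TEXT NOT NULL,
    observed_on TEXT NOT NULL,
    price REAL NOT NULL,
    PRIMARY KEY (hotel, source, stay_date, observed_on)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the observation store."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_price(conn, hotel, source, stay_date, observed_on, price):
    """Store one observation; a second run on the same day overwrites it."""
    conn.execute(
        "INSERT OR REPLACE INTO price_observation VALUES (?, ?, ?, ?, ?)",
        (hotel, source, stay_date.isoformat(), observed_on.isoformat(), price),
    )
    conn.commit()


def price_range(conn, hotel, stay_date):
    """Lowest and highest price observed so far for a hotel on a given night."""
    return conn.execute(
        "SELECT MIN(price), MAX(price) FROM price_observation "
        "WHERE hotel = ? AND stay_date = ?",
        (hotel, stay_date.isoformat()),
    ).fetchone()


# Made-up observations, only to show the flow.
conn = open_store()
night = date(2017, 2, 25)
record_price(conn, "Hotel A", "site-x", night, date(2017, 1, 7), 250.0)
record_price(conn, "Hotel A", "site-x", night, date(2017, 1, 14), 310.0)
record_price(conn, "Hotel A", "site-x", night, date(2017, 1, 14), 300.0)
print(price_range(conn, "Hotel A", night))
```

Keying on the observation day keeps growth to one row per hotel, source and night per day, which makes the "database can become large" concern easier to estimate up front.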
From 2012 to now, housing prices have barely changed.
Closed accidentally by an unrelated commit from the Rosie/Jarbas repos.
Filtering the quota dataset by records with the value 'Lodging, except for congressperson from Distrito Federal' in the column `subquota_description` will return many expenses made with hotels. We could match the value in the receipt against the publicly available price range (through Booking.com, for instance).
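A minimal sketch of that filter and comparison, using only the standard library. Only `subquota_description` and its lodging value come from this thread. The `supplier` and `net_value` column names, the `price_ranges` mapping and the sample rows are assumptions for illustration. It also assumes each receipt covers a single night, which will not always hold.

```python
import csv
import io

LODGING = "Lodging, except for congressperson from Distrito Federal"


def lodging_expenses(rows):
    """Keep only reimbursements whose subquota is lodging."""
    return [row for row in rows if row["subquota_description"] == LODGING]


def above_price_range(expense, price_ranges):
    """Return True when the receipt value exceeds the known range for the hotel.

    `price_ranges` is a hypothetical mapping from supplier name to a
    (lowest, highest) nightly price. Hotels without a known range are
    never flagged.
    """
    known = price_ranges.get(expense["supplier"])
    if known is None:
        return False
    _, highest = known
    return float(expense["net_value"]) > highest


# Made-up sample standing in for the quota dataset.
SAMPLE = '''subquota_description,supplier,net_value
"Lodging, except for congressperson from Distrito Federal",Hotel A,250.00
"Lodging, except for congressperson from Distrito Federal",Hotel B,900.00
Fuels and lubricants,Posto C,120.00
'''

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
price_ranges = {"Hotel A": (180.0, 320.0), "Hotel B": (200.0, 450.0)}

suspicious = [
    expense
    for expense in lodging_expenses(rows)
    if above_price_range(expense, price_ranges)
]
for expense in suspicious:
    print(expense["supplier"], expense["net_value"])
```

A flagged record would only be a candidate for manual review, since a high value can simply mean several nights on one receipt.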