juliuste / bahn.guru

Deutsche Bahn ticket price calendar.
https://bahn.guru
ISC License
367 stars 22 forks

Use median instead of minimum price #12

Open benjaminweb opened 6 years ago

benjaminweb commented 6 years ago

The minimum price can mislead. Take a month where most days show 19 EUR: on some days most fares really are 19 EUR, while on other days most fares are a multiple of 19 EUR and only a single odd connection costs 19 EUR.
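To illustrate the point, here is a minimal sketch comparing minimum and median for two days. The fare lists are made-up example values, not real Deutsche Bahn data, and `summarize` is a hypothetical helper name.

```python
from statistics import median

# Hypothetical fares (EUR) for two days; both days share the same minimum.
typical_day = [19.0, 19.0, 19.0, 29.0, 39.0]   # most fares are 19 EUR
outlier_day = [19.0, 57.0, 57.0, 76.0, 95.0]   # only one odd 19 EUR fare


def summarize(fares):
    """Return (minimum, median) for a list of fares."""
    return min(fares), median(fares)


print(summarize(typical_day))  # (19.0, 19.0)
print(summarize(outlier_day))  # (19.0, 57.0)
```

Both days would show 19 EUR in a minimum-price calendar, while the median separates them.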

Just my bits…

juliuste commented 6 years ago

I'm sorry for not answering this in 4 months 🙈 In theory, I agree with your point, but since people are often looking for the lowest possible price, displaying the median price probably wouldn't help them. But it might make sense to have some other form of sorting, e.g. considering not only the price but the price per travel time (price maybe squared). I will think about this again.

derhuerst commented 6 years ago

Why not show a graph per day, with a very low opacity, behind the lowest price?

juliuste commented 6 years ago

Which information should this graph display?

derhuerst commented 6 years ago

A distribution of the prices of all tickets available for that day.

Also the development of the minimum price in the past days would be interesting.

benjaminweb commented 6 years ago

"A distribution of the prices of all tickets available for that day."

Sparklines? A simpler variant (a kind of histogram with only 3 bars): Q1 (25 % quantile), Q2 (= median), Q3 (75 % quantile) would serve that purpose, cf. 1.

"Also the development of the minimum price in the past days would be interesting."

  • X% (superscript to Q1 through Q3) would denote the markup in the last 7 days (or choose a sensible period). Will this be the main decision criterion for the user? If yes, then there is nothing more important than this.
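A rough sketch of the three-bar idea, assuming fares for one day are available as a plain list. The sample fares and the helper `markup_percent` are illustrative assumptions; `statistics.quantiles` is from the Python standard library.

```python
from statistics import quantiles

# Hypothetical fares (EUR) for a single day.
fares = [19.9, 25.9, 29.9, 45.9, 47.9, 49.9, 59.9, 67.9, 75.9]

# n=4 yields the three cut points Q1, Q2 (median) and Q3.
q1, q2, q3 = quantiles(fares, n=4, method="inclusive")


def markup_percent(old_price, new_price):
    """Percentage change from old_price to new_price."""
    return (new_price - old_price) / old_price * 100


print(q1, q2, q3)                  # 29.9 47.9 59.9
print(markup_percent(20.0, 25.0))  # 25.0
```

The markup would be computed per quartile against its value 7 days earlier (or whichever period turns out to be sensible).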

Price surges happen in the last days before departure; in the weeks before that, the price remains stable. Why not vary the time granularity of the price display accordingly?

Just an idea: "Book no later than … days before travel to escape a price increase of x%."

thigg commented 5 years ago

I hacked something together for myself, which looks like this: https://github.com/thigg/bahn.guru/commit/9f2a7766bdfe08fd48eeca9d0d8685226d95633d#commitcomment-30855700

I would submit a pull request if you are interested. For each day, the cheapest price at each hour is shown as a graph.
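The per-hour reduction described here could look roughly like the following. This is a sketch under assumptions, not the code from the linked commit: connections are taken to be `(departure, price)` tuples and `cheapest_per_hour` is a hypothetical name.

```python
from datetime import datetime


def cheapest_per_hour(connections):
    """Map each departure hour (0-23) to the lowest fare leaving in that hour."""
    best = {}
    for departure, price in connections:
        hour = departure.hour
        if hour not in best or price < best[hour]:
            best[hour] = price
    # Sort by hour so the result can be plotted left to right.
    return dict(sorted(best.items()))


# Hypothetical connections for one day.
day = [
    (datetime(2018, 11, 11, 8, 5), 59.9),
    (datetime(2018, 11, 11, 8, 40), 29.9),
    (datetime(2018, 11, 11, 6, 14), 19.9),
]
print(cheapest_per_hour(day))  # {6: 19.9, 8: 29.9}
```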

image

benjaminweb commented 5 years ago

First trial balloon on bahn.jetzt.

@juliuste Is it okay to trigger concurrent requests on bahn.guru as implemented now?? Requesting permission ;-).

screen shot 2018-11-11 at 18 06 09

screen shot 2018-11-11 at 19 52 52

juliuste commented 5 years ago

@thigg Thank you very much for your work, I'm sorry that I didn't see your answer before. @derhuerst is this what you had in mind?

juliuste commented 5 years ago

@benjaminweb Looking great 😮 Feel free to PR

benjaminweb commented 5 years ago

Oh, sorry, I forgot to link to the repo: https://bitbucket.org/hyllos/bahn_preis_vtl. Any idea how to integrate this Python script into JS, or vice versa?

thigg commented 5 years ago

@benjaminweb can you explain what the first graphic is depicting? I understand it as the percentage of prices in each price range.

We have quite different approaches now; maybe we should think about how they can be added to the overall UI.

@juliuste

benjaminweb commented 5 years ago

@thigg It shows the share of fares falling into each price bracket for the specific date of your journey.
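As a sketch of what such a chart computes, the following buckets the fares of one day into fixed-width price brackets and returns each bracket's share. The bracket width of 20 EUR and the name `bracket_shares` are assumptions; the actual prototype may bucket differently.

```python
from collections import Counter


def bracket_shares(fares, width=20):
    """Return {bracket_lower_bound: share_of_fares} for fixed-width brackets."""
    # A fare f falls into the bracket [k * width, (k + 1) * width).
    counts = Counter(int(f // width) * width for f in fares)
    total = len(fares)
    return {low: counts[low] / total for low in sorted(counts)}


# Hypothetical fares (EUR) for one travel date.
print(bracket_shares([19.9, 29.9, 35.0, 59.9]))  # {0: 0.25, 20: 0.5, 40: 0.25}
```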

I've just prototyped. It turned out to be a different animal than expected.

benjaminweb commented 5 years ago

…peeking 180 days in advance:

screen shot 2018-11-18 at 07 51 56

benjaminweb commented 5 years ago

Current State

https://bahn.jetzt/$variant/$startStationId/$stopStationId/$daysAhead and https://bahn.jetzt/$variant/$startStationName/$stopStationName/$daysAhead return an embeddable SVG
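For anyone embedding these charts, a small helper that fills in the URL template above might look like this. `chart_url` is a hypothetical name, and percent-encoding the station names is an assumption about what the server accepts; the variants `preis`, `dauer` and `gewichtet` are the ones listed in this comment.

```python
from urllib.parse import quote

BASE_URL = "https://bahn.jetzt"


def chart_url(variant, start, stop, days_ahead):
    """Build /$variant/$start/$stop/$daysAhead with percent-encoded segments."""
    parts = [variant, str(start), str(stop), str(days_ahead)]
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


print(chart_url("preis", "Hamburg", "München", 14))
# https://bahn.jetzt/preis/Hamburg/M%C3%BCnchen/14
```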

Examples (100 days in advance Hamburg -> Munich)

Price (Preis): http://bahn.jetzt/preis/Hamburg/München/14

screen shot 2018-11-26 at 02 04 02

Duration (Dauer): http://bahn.jetzt/dauer/Hamburg/München/14

screen shot 2018-11-26 at 02 03 07

EUR/min (gewichtet, i.e. weighted): http://bahn.jetzt/gewichtet/Hamburg/München/14

screen shot 2018-11-26 at 02 01 23

code lives at https://bitbucket.org/hyllos/bahn_preis_vtl

Possible Integration

=> bahn.guru calls bahn.jetzt with stationIds or stationNames and embeds svg


TODO/IDEAS

o subclass group to view hours in description of hover
o include booking links
o extend API by further options
o tests: currently only partly
o flixbus?
o overview of top 10 relations?
o shorten loading time (it's already asynchronous): how?
o group durations per price

{105.9: {'5:40', '5:41', '6:04', '6:36', '6:14', '5:51', '5:59', '6:47', '6:30', '5:42', '5:48', '5:45', '5:39', '5:37', '6:22', '5:44', '6:19', '6:25'}, 125.9: {'9:38', '6:30', '11:29', '5:48', '6:22', '6:24', '6:09', '5:41', '5:59', '9:21', '5:44', '7:05', '5:40', '5:38', '6:19', '6:25', '6:04', '6:14', '5:42', '10:01', '5:37', '5:39', '5:45', '11:30'}, 133.9: {'6:04', '5:41', '6:14', '5:59', '5:42', '9:21', '5:45', '5:40', '6:19'}, 139.9: {'6:04', '5:41', '6:14', '5:42', '5:37', '5:45', '5:44', '7:05', '5:40', '11:30'}, 150.0: {'6:04', '5:41', '6:14', '9:21', '5:42', '11:19', '5:48', '5:45', '5:39', '5:37', '6:22', '10:01', '5:44', '7:05', '5:52', '11:16', '5:40', '11:30'}, 75.9: {'11:08', '5:40', '5:41', '9:38', '10:18', '10:21', '9:21', '5:42', '5:48', '5:45', '10:01', '11:41', '6:22', '5:37', '5:44', '5:39', '8:19', '6:19'}, 89.9: {'11:08', '6:30', '5:48', '6:09', '5:41', '5:53', '9:21', '5:44', '7:05', '10:30', '5:40', '9:23', '10:21', '5:43', '6:38', '5:52', '6:19', '6:25', '5:42', '6:18', '10:01', '5:45', '5:39', '5:37'}, 157.5: {'5:53', '5:42', '7:05', '11:30'}, 115.9: {'5:42'}, 67.9: {'5:41', '6:30', '5:42', '5:44'}, 29.9: {'5:41'}, 45.9: {'5:41', '9:21'}, 59.9: {'5:41', '5:42'}, 25.9: {'5:41', '9:21'}, 47.9: {'5:41', '9:21'}, 95.9: {'5:41'}, 19.9: {'6:14'}, 49.9: {'5:40'}}
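A mapping of that shape (price to set of durations) can be built in a few lines. This is only a sketch: the input format of `(price, duration)` tuples and the name `group_durations_by_price` are assumptions, not taken from the bahn_preis_vtl code.

```python
from collections import defaultdict


def group_durations_by_price(connections):
    """Collect the distinct durations offered at each price."""
    grouped = defaultdict(set)
    for price, duration in connections:
        grouped[price].add(duration)
    return dict(grouped)


# Hypothetical connections as (price in EUR, duration as 'h:mm').
sample = [(25.9, "5:41"), (25.9, "9:21"), (19.9, "6:14"), (25.9, "5:41")]
print(group_durations_by_price(sample))
```

Using a set drops duplicate durations at the same price, which matches the shape of the output shown above.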

Recent Changes

o FIX: now display actual count of relations.
o show price, duration and duration weighted price, see links above (addressing concern by @juliuste)
o https://bahn.jetzt allows stationIds AND stationNames (powered by bahn-station-api)
o y-axis: matches now absolute relation count (instead of %)
o error message if unknown station specified
o bahn.guru dependency: released -> directly accessing bahn.de
o stationIds: (temporarily) instead of names
o colours: switch to CleanStyle
o bars: make absolute (relations per day) instead of (percent of relations per day)
o relations too early to book: skip them
o legend labels: shortened

Discussion: False positives mixed with positives

Problem statement:

scenario 1: 10 hour connection, price: EUR 100
scenario 2: 5 hour connection, price: EUR 50

weighted approach: produces the same number for both: 10 EUR/hr

False positives:
a. high price, high hours => user not interested
b. normal price, high hours => user not interested

Positives:
c. low price, normal hours => user interested

I. determine (A) low & (B) normal hours corridors
II. determine low price corridor
III. intersect: pick those relations within (A), (B) and II.
IV. highlight matches from III.


simplify: => toss high hours.
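The problem and the simplified filter can be sketched as follows. The thresholds `max_hours` and `max_price` stand in for the corridors from steps I and II and are placeholder assumptions, as are the function names and the dict-based connection format.

```python
def eur_per_hour(price, hours):
    """Duration-weighted price, as in the weighted approach."""
    return price / hours


def interesting(connections, max_hours, max_price):
    """Toss high-hours connections, then keep those inside the low price corridor."""
    return [
        c for c in connections
        if c["hours"] <= max_hours and c["price"] <= max_price
    ]


# The two scenarios from the problem statement get the same weighted score.
print(eur_per_hour(100, 10), eur_per_hour(50, 5))  # 10.0 10.0

connections = [
    {"hours": 10, "price": 100},  # a. high price, high hours
    {"hours": 10, "price": 60},   # b. normal price, high hours
    {"hours": 5, "price": 30},    # c. low price, normal hours
]
print(interesting(connections, max_hours=6, max_price=40))
# [{'hours': 5, 'price': 30}]
```

Filtering on hours and price separately avoids the ambiguity of a single EUR/hr number.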

benjaminweb commented 5 years ago

@juliuste => How can we take this forward?

juliuste commented 5 years ago

Great work 👍

It would be really cool to have this in the /calendar view of bahn.guru (and maybe also in the day view, showing hours instead of days). However - at least for the calendar - we should discuss if people should be able to switch between the "normal" view and this one or if we should just display both (e.g. the new view above the calendar, like on flight websites).

We also need to check how this looks on mobile.

I already have one request, though 😄 Could we move the diagram key from the left to the bottom of the chart (or the top) and maybe reduce the height of the y-axis a little so that we have something with an aspect ratio closer to 3:1 rather than 3:2 (would make it easier to add the diagram above/below the current calendar).

benjaminweb commented 5 years ago

Thanks for your feedback.

Let's rethink the architecture before creating chaos ;-):

o get_prices: factor out into dedicated prices API
o plot_chart
o draw_calendar

That prices API would enable others to create things we do not even dream of. Can't say when I will find time for the prices API.

btw: would it be a prices API or a Sparpreis API only?

benjaminweb commented 5 years ago

Update

ø get_prices: factor out into dedicated prices API
ø plot_chart: renewed
ø draw_calendar (segment): all routes (except /stationId) return HTML by default, JSON if Accept: application/json is part of the headers
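The HTML-by-default, JSON-on-request behaviour could be decided with a check like the one below. This is a framework-independent sketch; `wants_json` and `pick_format` are hypothetical names and not how sparpreis-api is known to implement it.

```python
def wants_json(headers):
    """Return True if the Accept header lists application/json."""
    # Header names are case-insensitive, so normalise the keys first.
    normalised = {name.lower(): value for name, value in headers.items()}
    accept = normalised.get("accept", "")
    # Drop parameters such as ";q=0.9" and compare the media types only.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "application/json" in media_types


def pick_format(headers):
    """HTML by default, JSON only when the client asks for it."""
    return "json" if wants_json(headers) else "html"


print(pick_format({"Accept": "application/json"}))  # json
print(pick_format({}))                              # html
```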

=> what's next?

Conceptualise entry page with search boxes, similar to bahn.guru's root page.

What's the coverage of the connections? All fares or only Sparpreise (saver fares)?

benjaminweb commented 5 years ago

Update: Version 0.1.8.1 of sparpreis-api

https://bahn.jetzt

TODO:

nepumuk-fs commented 2 years ago

@juliuste "since people are often looking for the lowest possible price, displaying the median price probably wouldn't help them. But it might make sense to have some other form of sorting, e.g. considering not only the price but the price per travel time (price maybe squared). I will think about this again."

@derhuerst That graph thing seems to go off course…

What about showing the cheapest price with its (shortest) travel time and the shortest travel time with its (cheapest) price?
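Both picks reduce to a minimum over a two-part sort key. A minimal sketch, assuming connections are `(price, minutes)` tuples; the function names are made up for illustration.

```python
def cheapest_then_fastest(connections):
    """Cheapest price; ties broken by the shortest travel time."""
    return min(connections, key=lambda c: (c[0], c[1]))


def fastest_then_cheapest(connections):
    """Shortest travel time; ties broken by the cheapest price."""
    return min(connections, key=lambda c: (c[1], c[0]))


# Hypothetical connections as (price in EUR, travel time in minutes).
day = [(19.9, 374), (19.9, 561), (49.9, 337), (95.9, 337)]
print(cheapest_then_fastest(day))  # (19.9, 374)
print(fastest_then_cheapest(day))  # (49.9, 337)
```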