LukeWoodward / SplitsBrowser

Orienteering results analysis
GNU General Public License v2.0

SplitsBrowser as a proxy? #63

Status: Closed (closed by sfreytag 5 years ago)

sfreytag commented 5 years ago

This is not really an issue, more just something I'm curious to ask. I was wondering about a proxy to make SplitsBrowser work for all the SportIdent HTML results already out there, without an additional upload to a SplitsBrowser backend.

I like to think my mate Ed invented O event graphs, when he did an epic spreadsheet to track the Scottish 6 Days for a group of us back in 2003 using a graph to show us changing positions over the 6 days. In reality I guess a few people were thinking about the same things, because SplitsBrowser appeared soon afterwards and got it totally right. It was very insightful for me - it stopped me thinking my mistakes did not matter when I saw how much the graph plunged downhill. Then the applet became buggy, but it got resurrected in its current modern structure, and became even nicer to use, so thanks for your efforts.

The only problem is that not many events get uploaded through the backend, so it feels like this frontend does not get as much use as it should. I am only an occasional orienteer so I have not been involved with event admin, but I guess once the organiser has uploaded a set of results & splits, there is less and less motivation to start uploading to the other places: RouteGadget, WinSplits, SplitsBrowser, etc. That is a shame, because I reckon this is a better analysis tool than just looking at a table of splits.

We were at a small local event for improvers last night, and someone was wondering whether they needed to focus on improving by eradicating mistakes, or by just running and navigating a bit faster. I thought a SplitsBrowser graph would be just the thing to answer that question.

The event results are available in the standard HTML:

https://www.mdoc.org.uk/results-archive/2019/2019-05-21-lyme-intro-2/stage1_light_green_course_splits.html

I downloaded that HTML and wrote a bit of JS to hack it into SplitsBrowser CSV format, as described on the wiki page about using SplitsBrowser offline. It worked great and I now have that event available in SplitsBrowser, albeit just running on my localhost. If you're interested, the commits are here:

https://github.com/sfreytag/SplitsBrowser/commits/experimental-proxy

Hence I was wondering if SplitsBrowser could be set up to read any SportIdent results file over the web, such that it can be used with any set of existing results without an additional upload. I don't think this process can be built into this frontend codebase because of cross-origin problems - an Ajax request to that MDOC page fails, for example. So I think it would need a little proxy service that would:

  1. Take a URL of any SportIdent HTML splits upload
  2. Fetch that server-side, and parse it into SplitsBrowser CSV, improving on the logic I've done already (more robustness needed)
  3. Load the SplitsBrowser front end and give it the results of (2)
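
The first two steps above could be sketched roughly as follows. This is a minimal Python sketch, not the code from the experimental-proxy branch. The names `fetch_html`, `TableRowParser`, `table_rows` and `rows_to_csv` are hypothetical, and the assumed table layout (a header row, then one row per competitor holding name, club, one time per control, and a finish time) is an illustration only; real SportIdent pages would need their own, more robust mapping. The output shape is also a simplified stand-in, so the exact SplitsBrowser CSV format should be checked against the wiki.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch_html(url: str) -> str:
    """Fetch the results page server-side, where browser cross-origin rules do not apply."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_rows(html: str) -> list:
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(course_name: str, rows: list) -> str:
    """Assumed layout: header row, then name, club, control times..., finish time."""
    data_rows = [row for row in rows[1:] if len(row) >= 3]
    n_controls = len(data_rows[0]) - 3 if data_rows else 0
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([course_name, n_controls])
    writer.writerows(data_rows)
    return out.getvalue()
```

A caller would chain these as `rows_to_csv("Light Green", table_rows(fetch_html(url)))` and hand the resulting text to the front end for step 3.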

Alternatively, the proxy could just fetch the file, and the parsing could go into the SplitsBrowser codebase by extending your existing HTML parser. But I thought that would probably not be welcome, because this is essentially a screen-scrape of unstructured data and so belongs somewhere else, rather than being a genuinely useful addition to the parsers.
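
For the fetch-only variant, the proxy's job reduces to relaying the page with a response header that allows the browser to read it. A rough Python sketch using only the standard library is below. The `url` query parameter, the `ALLOWED_HOSTS` allow-list, the port and the function names are all assumptions made for illustration, not part of SplitsBrowser.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import URLError
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen

# Hypothetical allow-list, so the proxy cannot be used to fetch arbitrary hosts.
ALLOWED_HOSTS = {"www.mdoc.org.uk"}


def target_url(request_path: str):
    """Return the ?url=... value if it is http(s) on an allowed host, else None."""
    candidates = parse_qs(urlparse(request_path).query).get("url", [])
    if not candidates:
        return None
    parsed = urlparse(candidates[0])
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.hostname not in ALLOWED_HOSTS:
        return None
    return candidates[0]


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = target_url(self.path)
        if url is None:
            self.send_error(400, "Missing or disallowed url parameter")
            return
        try:
            with urlopen(url, timeout=10) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "text/html")
        except URLError:
            self.send_error(502, "Could not fetch upstream page")
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        # This header is what lets a page on another origin read the response.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Restricting the target to an allow-list is a deliberate choice in this sketch: an open relay that fetches any URL on request is easy to abuse.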

Have you thought about this already and would you have any insights or thoughts about it?

Do you know if there is any work done already to parse SportIdent HTML?

LukeWoodward commented 5 years ago

Hi Simon,

Thanks for your message, and for the historical information.

As you noted, SplitsBrowser contains some functionality for parsing HTML-format results files. However, that's because the old Java applet did so too. If I was going to write something to replace the Java applet, my replacement would at least need to do everything the applet did and to parse (most of) the files that had already been uploaded to splitsbrowser.org.uk.

I don't really want to add any further HTML parsing capabilities to SplitsBrowser. There are a number of reasons for this:

I also have to wonder whether MDOC (or any other club site hosting SportIdent results) would appreciate their pages being scraped to show the data elsewhere.

I'm not aware that anyone else has tried to work on parsing SportIdent's current HTML format. It isn't something I've been asked about nor thought about before.

I appreciate that it's effort for an event official to upload their results files to several different places. I'm not sure what the best solution to that problem is, although you're not the first person to mention it to me. Indeed, I believe SplitsBrowser was added to RouteGadget to give event officials one fewer place to upload to.

Anyway. I hope some of this is informative for you, and thanks for your positive comments about the work I've done on SplitsBrowser.

Yours,

Luke