LukeWoodward / SplitsBrowser

Orienteering results analysis
GNU General Public License v2.0

Support variation CSV file format #4

Closed LukeWoodward closed 10 years ago

LukeWoodward commented 10 years ago

Event 6576 uses a CSV file format that vaguely resembles the SI CSV format. Support for this format should be added.

rmaro commented 10 years ago

With our home-made software HELGA, we have been using SplitsBrowser since the very start! We have developed an automatic extraction of the necessary file in the SI-CSV format, with transfer to our web server (www.helga-o.com, menu "Results"); on that page, the graphs are behind the small flags. When I upload such a file to SplitsBrowser online, it works perfectly, but it is rejected by the beta version. Can you help? If necessary, I can put a file on SB online as a test file.

LukeWoodward commented 10 years ago

Hi @rmaro,

Thanks for your comment. I've created a new issue to add support for the files you mention, as it's a different issue to this one. This issue relates to a format of data that is separated by commas, whereas the SI CSV format is actually separated by semicolons, despite the name. I had a quick look at one of the files from helga-o.com and it seems to be in a variation of the SI semicolon-separated format.

I know of two variations in the SI semicolon-separated format: one with 44 columns before the control codes and times, and one with 46 columns. The 46-column format was the first one supported, and issue #3 (which I've now fixed) adds support for the 44-column variation. The one file I've looked at from your website (http://www.helga-o.com/webres/splits/splitsbrowser.php?lauf=761) appears to use the 44-column variation. However, SplitsBrowser can't read it at the moment because the SI CSV reader uses the headers to detect whether the 44 or 46-column variation is being used and the file in question has had most of the headers removed.

I'm hopeful that the problem could be fixed simply by adjusting how the SI CSV reader detects the 44-column or 46-column format. However, as I've only given your file a quick glance, I can't guarantee that this will be enough. The existing SplitsBrowser Java applet appears to do this detection by searching for a number of strings in various languages, but I'd prefer to avoid that approach as it will only work in one of those languages.

I've developed the rewritten SplitsBrowser more-or-less independently from the folks who run splitsbrowser.org.uk, so I haven't really been able to access much of the data that has accumulated there over the years. Most of the issues I've found with the beta have simply been because I did most of the initial development and testing against a small number of data files.

rmaro commented 10 years ago

Hi Luke,

Thanks for your very fast and detailed reply.

Basically, we have two objectives:

a) With the .jar, we are using the zip file feature, but using the unzipped CSV is not really a problem: the files remain quite small, so zip is not mandatory.

b) As stated in your initial comments on the development, we have more and more problems from users due to these Java protections, so we are very interested in your solution.

If we can use both the .jar and the .js with the same export, we will propose that our HELGA users continue in this way and, as is done on SplitsBrowser online, we can offer two ways to obtain the graphs. Later, we can drop the .jar.

Splitsbrowser.jar does not seem to use the first line, so we truncated it because it was not significant. But if it is mandatory for you, we have no problem introducing a full header line as explained in your files. This header is also typical of SPORTident and Stephan Kramer's OE software. In BEL, we use Emit, but we switched to this CSV format because it was the one that allows us to manage the start times for the "Absolute time" graph.

I will check your new release with the 44 columns, but can you supply me with the necessary header line (an updated "si-headers-only")? Or is it perhaps already in the new release?

I will check these new features.

Regards, Robert

LukeWoodward commented 10 years ago

This is the header line of the one 44-column file that I have (event 6597 on splitsbrowser.org.uk):

Stno;SI card;Database Id;Name;YB;Block;nc;Start;Finish;Time;Classifier;Club no.;Cl.name;City;Nat;Cl. no.;Short;Long;Num1;Num2;Num3;Text1;Text2;Text3;Adr. name;Street;Line2;Zip;City;Phone;Fax;EMail;Id/Club;Rented;Start fee;Paid;Course no.;Course;km;m;Course controls;Pl;Start punch;Finish punch;Control1;Punch1;Control2;Punch2;Control3;Punch3;Control4;Punch4;Control5;Punch5;Control6;Punch6;Control7;Punch7;Control8;Punch8;Control9;Punch9;Control10;Punch10;(may be more) ...

The "(may be more) ..." at the end isn't me cutting the list short; that is actually in the file.

This header line isn't always in English (in at least one file I have it's in French), so you can't rely on exact matches with the names. The SI CSV reader currently looks for the first two consecutive column headers that end with the digit 1, and takes those as the control code and the time taken to the first control. Other than that, the headers aren't used.
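As a rough illustration of the header rule just described, the following sketch scans a header line for the first two consecutive column names that both end in the digit 1. It is not SplitsBrowser's actual code: the function name `find_first_control_column` is hypothetical, and the sample header is the 44-column one quoted above, cut off after the second control for brevity.

```python
def find_first_control_column(headers):
    """Return the index of the first header such that it and the next
    header both end with the digit 1, or None if there is no such pair."""
    for i in range(len(headers) - 1):
        if headers[i].strip().endswith("1") and headers[i + 1].strip().endswith("1"):
            return i
    return None


# The 44-column header quoted above, shortened to two controls (illustration only).
HEADER_LINE = (
    "Stno;SI card;Database Id;Name;YB;Block;nc;Start;Finish;Time;Classifier;"
    "Club no.;Cl.name;City;Nat;Cl. no.;Short;Long;Num1;Num2;Num3;Text1;Text2;"
    "Text3;Adr. name;Street;Line2;Zip;City;Phone;Fax;EMail;Id/Club;Rented;"
    "Start fee;Paid;Course no.;Course;km;m;Course controls;Pl;Start punch;"
    "Finish punch;Control1;Punch1;Control2;Punch2"
)

first_control = find_first_control_column(HEADER_LINE.split(";"))
print(first_control)  # index of "Control1" in this header
```

Because the rule only looks at the trailing digit, it does not depend on the language of the header. Columns such as Num1 and Text1 also end in 1, but neither is followed by another column ending in 1, so they are skipped.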

It should be possible to detect whether the 44 or 46-column variation is being used by searching backwards from the end of the first line and seeing where the control codes end. This will allow SplitsBrowser to be used with existing data which has already been created without the header line.
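One possible reading of this backwards search is sketched below; it is only a guess at the idea, not the fix that was actually implemented. It assumes that control codes are purely numeric, that times contain a colon (for example `01:23` or `1:02:03`), and that every control on the line has a time. The names `find_controls_start`, `CONTROL_CODE` and `TIME` are hypothetical.

```python
import re

# Assumed field shapes (not taken from the real reader).
CONTROL_CODE = re.compile(r"^\d+$")
TIME = re.compile(r"^\d+:\d{2}(:\d{2})?$")


def find_controls_start(fields):
    """Walk backwards from the end of a data line over (control code, time)
    pairs and return the index of the first such pair, or None if the line
    does not end in any pairs."""
    end = len(fields)
    # Ignore trailing blank fields, e.g. those left by a trailing semicolon.
    while end > 0 and fields[end - 1].strip() == "":
        end -= 1
    start = end
    while start >= 2:
        code = fields[start - 2].strip()
        time = fields[start - 1].strip()
        if CONTROL_CODE.match(code) and TIME.match(time):
            start -= 2
        else:
            break
    return start if start < end else None


# Made-up data line: 42 placeholder fields, start and finish punches, two controls.
row = ["x"] * 42 + ["10:00:00", "10:45:12"] + ["101", "01:23", "102", "03:45"] + [""]
print(find_controls_start(row))  # 44 for this made-up 44-column line
```

The search stops at the start and finish punch columns because a time is not a valid control code, so the returned index (44 or 46) tells the two variations apart without looking at any header text.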

LukeWoodward commented 10 years ago

I've spent some time this evening fixing the issues I found with a couple of the files I got from helga-o.com (752 and 761). All the issues I found while reading those two files are now fixed, so I've closed issue #5.

Unfortunately, it's still not possible to view the data from either of these files in SplitsBrowser, as both of them are hit by issue #2. If I disable the checks for the cumulative times being in order, then I can view the data in these files, although it does look a little odd. I'd like to improve the way SplitsBrowser handles such data, so I don't think it's sufficient to just disable these checks.
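For readers unfamiliar with the check mentioned here, a minimal sketch of what an "in order" test on cumulative times could look like follows. It assumes times in seconds, `None` for a missing punch, and strictly increasing values; the real check in SplitsBrowser may differ on all three points, and the function name is hypothetical.

```python
def cumulative_times_in_order(cumulative_times):
    """Return True if every recorded cumulative time is strictly greater
    than the previous recorded one. Missing punches (None) are skipped."""
    previous = None
    for time in cumulative_times:
        if time is None:
            continue
        if previous is not None and time <= previous:
            return False
        previous = time
    return True


print(cumulative_times_in_order([0, 65, 130]))  # True
print(cumulative_times_in_order([0, 130, 65]))  # False
```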

LukeWoodward commented 10 years ago

Fixed six days ago.