qcif / data-curator

Data Curator - share usable open data
MIT License
264 stars 38 forks source link

Perform csv pre-check on loading csv data and provide feedback to user if there is a problem with the data before attempting to load #651

Closed ghost closed 6 years ago

ghost commented 6 years ago

See: #611

Hi @louisjasek I'll have a look at 1 and 3 shortly. But for Q2, the problem here is the dataset Invalid closing quote at line 23367; found "\"" instead of delimiter "," Data Curator cannot load an invalid csv. @Stephen-Gates, perhaps we need to add an enhancement which pre-checks csv data for validity and then provides more useful feedback to the user. That way, the user isn't left sitting there indefinately.

Stephen-Gates commented 6 years ago

I think this is a good idea. I’ll place it in 0.17.0

Stephen-Gates commented 6 years ago

@mattRedBox will this help? https://github.com/frictionlessdata/tableschema-js/pull/143

ghost commented 6 years ago

@Stephen-Gates Possibly, was also about to add reference to this ticket (but you beat me to it :) )

ghost commented 6 years ago

Apart from dialog, also implement a timeout on any ui blockout messages to ensure that problems or bad data don't leave loading screens in place indefinately.

louisjasek commented 6 years ago

I think we should try for as simple as possible here. @mattRedBox I like the idea of a time-out. The error message could be vague if there are other reasons it might not load. E.g. "Something went wrong. We couldn't load your file. Please check the file is valid and try again". And maybe we could link to our forum or a third-party resource with a description of what to check to make sure your file is valid.

ghost commented 6 years ago

Hi @Stephen-Gates, @louisjasek Sorry spilled out this issue in a hurry the other day so I didn't forget. We already have the validation really in the fact that an error occurs - I just need to move this up and mask with feedback message for a user. The timeout won't be difficult to implement either. There'll just be some time taken to write the code and test out for a few scenarios.

Stephen-Gates commented 6 years ago

Do you need me to make test data @mattRedBox

ghost commented 6 years ago

I'll implement this based on datasets that failed from Qld Govt that we've seen with Paul's help and the one from @4estone. But perhaps for your own rigorous testing other types of failures should be applied. What I'm looking to do is make sure the error is fed back as soon as the app fails (so withdraw from the data load immediately), but as a catch-all if that fails for whatever reason then the worst case scenario is that the loading stops say after 25s, with an error message that there was a timeout and a problem with the data load - we can always tweak the error message later to something more relevant (and of course we can tweak the timeout duration but we probably need to account for less powerful laptops/desktops that might take a bit longer to load data anyway).

Stephen-Gates commented 6 years ago

Consider refining the timeout value to support opening large files.

(Could add timeout value to Set Preferences?)

ghost commented 6 years ago

Hi @Stephen-Gates That's probably the only way to do it that makes sense to me

louisjasek commented 6 years ago

If possible and quick, also implement simple error notification to support #277. It is a nice to have, so do not spend time on this over other functionality.

ghost commented 6 years ago

Hi @louisjasek Yes I think we can fit it in, but otherwise will let you know.