automeris-io / WebPlotDigitizer

Computer vision assisted tool to extract numerical data from plot images.
https://automeris.io
GNU Affero General Public License v3.0
2.67k stars, 363 forks

Feature suggest: include extracted data .csv into project archive .tar #230

Open nbehrnd opened 4 years ago

nbehrnd commented 4 years ago

WebPlotDigitizer can save a project as a .tar archive that includes the original picture, so the data extraction can be replicated and repeated with other parameters. While the file wpd.json obviously includes the extracted data, it is not useful as such for import and plotting with gnuplot or a GUI such as gnumeric. Thus, I would like to suggest that the .csv offered as a download in the intermediate menu window be included by default in the project archive as an additional file. I speculate that redirecting a copy of the already prettified output (the downloadable .csv) into the archive would demand little development time, yet might ease the use of the program even further. Uploading the .tar to revisit the project would probably not be affected by the additional file, either. The attached archive, example.zip, includes the project archive boiling_point.tar to illustrate the point.
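Until such a feature exists, the idea can be approximated outside the program. The following is only a minimal sketch, not part of WebPlotDigitizer: it assumes that wpd.json holds a `datasetColl` list whose entries carry a `name` and a `data` list of points, each with a `value` array of calibrated coordinates. That layout, and the helper names `find_member`, `datasets_to_csv` and `add_csv_to_project`, are assumptions made for illustration. Whether WebPlotDigitizer still loads an archive with the extra file is untested, so work on a copy.

```python
import csv
import io
import json
import tarfile


def find_member(tar, basename):
    """Return the first regular file in the archive with the given base name."""
    for member in tar.getmembers():
        if member.isfile() and member.name.split("/")[-1] == basename:
            return member
    raise FileNotFoundError(basename)


def datasets_to_csv(project):
    """Map each dataset name to CSV text (assumed wpd.json layout, see above)."""
    result = {}
    for dataset in project.get("datasetColl", []):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        for point in dataset.get("data", []):
            writer.writerow(point["value"])  # calibrated coordinates of one point
        result[dataset.get("name", "dataset")] = buffer.getvalue()
    return result


def add_csv_to_project(tar_path):
    """Append one CSV per dataset next to wpd.json; return the names added."""
    with tarfile.open(tar_path, "r") as tar:
        member = find_member(tar, "wpd.json")
        project = json.load(tar.extractfile(member))
    folder = member.name.rsplit("/", 1)[0] if "/" in member.name else ""
    added = []
    # Append mode only works because the project archive is an uncompressed tar.
    with tarfile.open(tar_path, "a") as tar:
        for name, text in datasets_to_csv(project).items():
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(f"{folder}/{name}.csv" if folder else f"{name}.csv")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
            added.append(info.name)
    return added
```

Calling `add_csv_to_project("boiling_point.tar")` on a copy of the project archive would then leave one .csv per dataset beside wpd.json, ready for gnuplot or gnumeric.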

Of course, there are more reasonable ways to recreate a plot like the one here, as the values of the abscissa should be integers only. The post's purpose is just to illustrate the situation. The source of the plot is this Wikimedia Commons page. example.zip

nbehrnd commented 4 years ago

Given the problems encountered in issue 231, I would add the suggestion to replace the .tar format used as the container for these data with a .zip archive. Maybe the compression obtained by rar is superior to that of zip. However, given the size of the data to transfer, the benefit of being able to open a zip archive practically by default on Linux, Mac OS, and Windows may outweigh the lost compression performance.

Suggestion: To access legacy data, WebPlotDigitizer should continue to read .tar archives. New project archives should be written to and read from .zip archives.

ankitrohatgi commented 4 years ago

Including additional CSVs in the tar archive is feasible, but I am not sure that switching to zip helps much. The tar archive is created and read on the client side without any data transfer over the network, so compression would only save final disk space for the user (this uses my hand-coded implementation, since HTML5 doesn't handle such files out of the box: https://github.com/ankitrohatgi/tarballjs).

I'll hang on to this request for now until there is more bandwidth to implement this.

nbehrnd commented 4 years ago

Ankit,

the idea to replace .tar with .zip only came up because .zip appears to me to be an even lower (much more widespread) common denominator than .tar on Windows machines. Also, GitHub here permits the submission of a .zip, but not of a .rar. Thus I thought, not knowing an iota about JavaScript, that a module for writing and reading .zip might be part of the standard library, as I have experienced in Python.
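For reference, the Python experience alluded to here can be sketched with the standard-library modules `tarfile` and `zipfile`. This is only an illustration of repacking a project archive outside WebPlotDigitizer; the function name `tar_to_zip` is hypothetical, and WebPlotDigitizer itself is not known to read the resulting .zip.

```python
import tarfile
import zipfile


def tar_to_zip(tar_path, zip_path):
    """Copy every regular file from a tar archive into a new zip archive."""
    names = []
    with tarfile.open(tar_path, "r") as tar, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links and other special entries
            archive.writestr(member.name, tar.extractfile(member).read())
            names.append(member.name)
    return names
```

For example, `tar_to_zip("boiling_point.tar", "boiling_point.zip")` would produce an archive that the file managers on Windows, Mac OS, and Linux typically open without extra tools.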

I take from your reply that this assumption was wrong. Since my choice of Windows or Linux depends on the task ahead anyway, the ticket remains a mere suggestion.

Thank you.
