highperformancecoder / minsky

A systems dynamics economics modeling software
http://minsky.sf.net
GNU General Public License v3.0
290 stars 51 forks source link

Feature/csv download by line #506

Closed Rdmkr closed 3 months ago

Rdmkr commented 3 months ago

This is very close to a working solution, but crashes hard (most of the time..?) on my PC... any idea what could be wrong? The logs aren't much help. Also I wonder how much of the logic from loadWebFile we still need, as it seems to me most of it is covered by electron's file download client.


This change is Reviewable

Rdmkr commented 3 months ago

I dealt with the timing and URL parsing issues, but there is still a crash.

highperformancecoder commented 3 months ago

This is very close to a working solution, but crashes hard (most of the time..?) on my PC... any idea what could be wrong? The logs aren't much help. Also I wonder how much of the logic from loadWebFile we still need, as it seems to me most of it is covered by electron's file download client.

This change is Reviewable

I think it is right to discard the loadWebFile C++ method.

I'll take this for a spin now, and see if I can spot the cause of the crash.

highperformancecoder commented 3 months ago

OK - took it for a spin. There are some minor build errors, which I'll leave you to fix, but then I tried loading the test dataset at https://www.hpcoders.com.au/BIS_GDP.csv, and it loaded beautifully - no crash. (BTW - there is a C++ unit test, that will need to be converted into a jest test here).

I also tried https://sourceforge.net/p/minsky/ravel/20/attachment/BIS_GDP.csv (which we had some difficulties before), and it worked great too!

Rdmkr commented 3 months ago

I tried one of the files supplied by Steve on the ticket: https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.txt Two things notable about this: 1. the extension is .txt (easy enough to solve by changing it to .csv or interpreting it as such - idk what your C++ code does with this) 2. there are comments at the top of the file, which iirc is not according to the strict csv spec, but tolerated by some parsers... maybe ours needs to start doing that too... Can we at the very least fix the error handling so there is some feedback on why this is failing? It is frustrating to me in development but also not something we should want our users to be exposed to.

ps. Steve had trouble with this one too: https://www.ncei.noaa.gov/pub/data/paleo/pages2k/neukom2019temp/recons/Full_ensemble_median_and_95pct_range.txt

highperformancecoder commented 3 months ago

The first file causes an assertion failure due to an out of bounds access on an empty line. I'll fix that tomorrow. The second file loads fine - see the image snapshot. paleo

highperformancecoder commented 3 months ago

So the crash you noticed was a bug that was already fixed on master. The land-open temperature index cannot quite be imported, because the merge-delimiters option is not quite working. I've raised a ticket for that, but it needn't block merging this PR, as it is completely orthogonal to the concerns of this PR. There's 3 lines where the build is breaking. We can raise a ticket to deal with saving the URL to schema for downloading updates later. Its not something we're supporting right now. We're planning on a full release early next week, so I'm keen to get this change merged ASAP.

Rdmkr commented 3 months ago

Just committed some C++ build fixes.

Rdmkr commented 3 months ago

I also committed code for zip decompression. This takes the first file in a zip... not sure what to do if there are multiple, but it's already a stretch to accommodate zip file users...

highperformancecoder commented 3 months ago

I also committed code for zip decompression. This takes the first file in a zip... not sure what to do if there are multiple, but it's already a stretch to accommodate zip file users...

Just a minor thing - you missed committing the changes to package.json. I did a "npm install decompress", and all was good.

Oh - the failure to exit on close stuff - I'm getting a message about at Map.forEach () at BrowserWindow. (/home/rks/github/minsky-niels/gui-js/dist/executables/linux-unpacked/resources/app.asar/minsky-electron/main.js:158:108)

I'd say the download window is not being destroyed properly. I'll take a look.

highperformancecoder commented 3 months ago

The culprit is the progress bar. Putting progress.close(); as the last line of the 'done' handler in CommandsManager.downloadCSV() solves the problem.

For completeness, it might be worth adding that line to the done handlers in downloadMinsky() and downloadRavel(), although I've never seen the same problem, possibly because app.quit() is called in those methods.

Rdmkr commented 3 months ago

One remaining issue I can think of: this puts one downloaded.{extension} file in /tmp for every extension that is ever downloaded. In the case of zip it will first put the .zip file there and then the inner file next to it. Maybe we should delete files from there that aren't used.

Rdmkr commented 3 months ago

What's still preventing us from completing this PR?

highperformancecoder commented 3 months ago

On Sat, Jun 01, 2024 at 08:02:35AM -0700, Rdmkr wrote:

What's still preventing us from completing this PR?

Nothing! I intend to merge tomorrow morning :). I just had other things to do today. I only switched on my laptop for the first time today 10 minutes ago.

--


Dr Russell Standish Phone 0425 253119 (mobile) Principal, High Performance Coders @.*** http://www.hpcoders.com.au

highperformancecoder commented 3 months ago

One slight hiccup - the hang on quit bug was back again, and closing the progress bar turned out to be a red herring (sorry!). I was able to solve the problem by adding a 'destroyed' handler to each Windows' webContents to remove the window from the activeWindows map.