the-carlisle-group / Acre-Desktop

A simple Dyalog APL IDE plugin that introduces "projects" and allows you to keep your source code in Unicode text files.

New file format for variables #223

Closed by PaulMansour 4 years ago

PaulMansour commented 4 years ago

As noted in issue #221, Acre has a problem writing and reading large variables. We optimize for character vectors, lists and matrices, so they are no problem. But large numeric arrays, simple or nested, cannot effectively be handled by APLAN, and I doubt if they would be efficient even if baked into the interpreter. There is a simple solution to this: use a component file to hold the variable rather than a text file, and have CreateProject have an option for specifying that vars should use a component file rather than APLAN. Optimized text arrays would still be stored as they are.
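
To make the component-file idea concrete, here is a minimal sketch of storing one array per file and reading it back. The names WriteBinary and ReadBinary are hypothetical and this is not Acre's actual code; it assumes only the standard Dyalog component file functions ⎕FCREATE, ⎕FAPPEND, ⎕FSTIE, ⎕FREAD and ⎕FUNTIE, and that the target file does not already exist.

```apl
⍝ Hypothetical helpers for illustration only, not Acre's implementation.
WriteBinary←{                ⍝ ⍺: file name   ⍵: array to store
    tie←⍺ ⎕FCREATE 0         ⍝ create the file; 0 asks for the next free tie number
    _←⍵ ⎕FAPPEND tie         ⍝ the whole array becomes component 1
    _←⎕FUNTIE tie            ⍝ release the file
    ⍺                        ⍝ return the file name
}

ReadBinary←{                 ⍝ ⍵: file name
    tie←⍵ ⎕FSTIE 0           ⍝ share-tie the existing file
    r←⎕FREAD tie 1           ⍝ component 1 holds the array
    _←⎕FUNTIE tie
    r
}
```

With these definitions, 'bigvar.dcf' WriteBinary bigvar followed by ReadBinary 'bigvar.dcf' should return an array that matches the original. No text formatting or parsing is involved, which is the point of the proposal for large numeric arrays.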

Acre would read any var file types that happen to be in the source folder.

Obviously there are some compatibility issues with component files across interpreter versions, but I think they are negligible. Phil has implemented this and it is currently undocumented.

Any reason to not implement this feature? We would still need to settle on a file extension. Something like .dcfvar or .dcfarray?

Related is perhaps looking again at issue #211 and the recent changes with SetChanged and the vars parameter.

aplteam commented 4 years ago

I am happy with this.

PhilLast commented 4 years ago

One thing. I commented out one line, called a defined operator eu←{⍵ ⍺⍺¨{(⊂⍵⍳⍺)⌷⍺⍺ ⍵}∪⍵} instead of primitive each, and changed an internal one-liner to do nothing, and I can now write Mark's big var in APLAN

bigvar←?1000000 5⍴100

in 5 seconds and read it in 30.

It might not always give a similar benefit because I'm not sure where the benefit is coming from. Calling each-unique ought not give much benefit in this case as unique on the var saves only about 0.005%.

The removed code is the memoisation that does effectively the same thing but in several lines of code building and looking up a dictionary of previous computations. So I guess that the code to avoid doing the same thing repeatedly takes more time than the repetition.
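
For readers unfamiliar with the each-unique idea, here is a small sketch of how the eu operator quoted above behaves on a vector with repeated items. The function f and the data v are made up for illustration; default ⎕IO←1 is assumed. The operand runs once per unique item and the results are then indexed back into the original positions.

```apl
⍝ eu as quoted in the thread: apply ⍺⍺ to each unique item only.
eu←{⍵ ⍺⍺¨{(⊂⍵⍳⍺)⌷⍺⍺ ⍵}∪⍵}

f←{⌽⍵}                       ⍝ hypothetical stand-in for an expensive function
v←'ab' 'cde' 'ab' 'cde' 'ab' ⍝ 5 items, only 2 of them unique

r←f eu v                     ⍝ f runs on 'ab' and 'cde' once each
same←r≡f¨v                   ⍝ 1 when eu agrees with primitive each
```

The saving depends entirely on how many duplicates the argument contains, which is consistent with the observation that unique removes almost nothing from the big variable.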

PaulMansour commented 4 years ago

That's good. But I guess still relatively slow, if you have a lot of large vars.

PhilLast commented 4 years ago

> Any reason to not implement this feature? We would still need to settle on a file extension. Something like .dcfvar or .dcfarray?

The current file extension ".raw" and its corresponding CreateProject option value "-variables=Raw", both of which are outward facing, occur in only one place in acre, where they are assigned internal names. So they can easily be changed in a minute.

I chose the word to try to signify the unprocessed nature of the file which is uneditable. An alternative in that vein would be "binary".

Whatever the decision, my preference would be for them to retain their mutual similarity for ease of understanding, both for users and the clarity of the code - don't laugh - though it's logically unnecessary.

A decision is required.

There is also mention above of the issue related to SetChanged's saving vars. What happens now is that if acre detects an existing ".raw" file for the var it saves it similarly; otherwise as a ".apla" in APLAN.

A related question then is whether SetChanged should also acquire the "-variables=On|Off|whatever" option specifically for newly added vars, which currently default to ".apla".

PaulMansour commented 4 years ago

Let's go with binary.

PaulMansour commented 4 years ago

I would leave SetChanged alone for now.

PhilLast commented 4 years ago

OK