red / REP

Red Enhancement Process
BSD 3-Clause "New" or "Revised" License

Another lament on Redbin limitations, or why sparse designs are bound to be reinvented until complete #155

Open hiiamboris opened 1 year ago

hiiamboris commented 1 year ago

In my recent work on the ParSEE visualization tool, I needed to save a dump of all Parse events so that it could later be loaded into the GUI tool for analysis.

The reasons for this split design are:

  1. Spaces (on which the GUI tool is built) is a big library. Requiring it at the place of parsing would be unwise because:
    • countless #include bugs would make using such a tool pure hell, so no one would use it
    • such a big library is likely to affect the original program (consider its custom event scheduler, custom mold implementation, hacked console, and many exports)
  2. Requiring the GUI would require GUI functionality, which is not always available (say, on headless servers), and would limit the applicability of the tool
  3. It is sometimes simply convenient to save multiple results and process them later, without having to reproduce them every time (this also helps when they are hard to reproduce)

While point 1 could in theory be addressed in some far far future by a solid module system that could provide enough convenience and isolation, points 2-3 will remain valid, validating the whole need for backend/frontend separation.

The saved dump consists of thousands of events, each including the input Parse is working on and the rule processing that input:

Traditional mold/load cycle is inappropriate as it would destroy both sameness and offsets of all series:
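
A minimal sketch of that loss, using made-up names (`text`, `events`, `restored`) rather than real ParSEE data: two references to one string at different offsets are molded to text and loaded back.

```red
Red []

text: "abc"
events: reduce [text next text]              ; two references to one string, at offsets 1 and 2

restored: load mold events                   ; round trip through text: ["abc" "bc"]

print same? head events/1 head events/2      ; the originals share one string
print same? head restored/1 head restored/2  ; the loaded values are two separate strings
print index? events/2                        ; original offset of the second reference
print index? restored/2                      ; after loading, the offset is back at the head
```

After the round trip, the second value is just an independent "bc" string, so neither its identity with the first value nor its position inside it can be recovered.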

Redbin was supposed to be of help here, but its shortcomings make it a kludge rather than a solution.

Current Redbin implementation cannot save:

So when I try to save such a dump of values, most of the time I only get an error.

To force Redbin to save the dump, I had to recursively preprocess the whole tree of values (including both the input and the events dump) in the following manner:

  1. Copy each any-block series and keep a map [old block -> new block] - needed for (2)
  2. Replace each any-block with its copy from the map (1) - needed for (3-5), as I cannot modify the original rules/input in place or I'll break the parser
  3. Bind every word to the global context so it loses its binding during Redbin encoding (it's a special case in Redbin at the moment)
  4. Replace complex values (objects, maps, functions) with their abbreviations, as I don't need their content because Parse cannot enter them. This helps me avoid deep preprocessing of these values
  5. Replace values unsupported by Redbin by their abbreviations
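
Steps 1-2 can be sketched roughly as follows. This is only an illustration under assumptions, not the code used for ParSEE: `clone-tree` is a hypothetical helper, it handles `block!` values only (not every any-block type), and the "map" is a flat block of [original copy] pairs searched with `find/same`, since the lookup has to go by identity rather than by content.

```red
Red []

; Hypothetical helper for illustration; not the code used in ParSEE.
; `seen` is a flat block of [original copy original copy ...] pairs.
clone-tree: function [blk [block!] seen [block!]] [
    if pos: find/same/only/skip seen head blk 2 [
        return at pos/2 index? blk          ; already copied: reuse it, keep the offset
    ]
    new: copy head blk                      ; shallow copy of the whole series
    append/only append/only seen head blk new
    forall new [
        if block? :new/1 [change/only new clone-tree :new/1 seen]
    ]
    at new index? blk                       ; same offset as the original reference
]

rule: [a b c]
events: reduce [rule next rule]             ; two references into one block
dup: clone-tree events copy []

print same? head dup/1 head dup/2           ; do the copies still share one series?
print index? dup/2                          ; offset of the second reference
print same? head dup/1 rule                 ; is the original rule still referenced?
```

Registering each copy in `seen` before descending into it is what keeps shared and cyclic references pointing at a single copy. Steps 3-5 would be further passes over the copied tree.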

For more general preprocessing, I would also need to include objects, maps and functions in the deeply preprocessed data set.

In essence, it's a slow high-level reinvention of the logic of Redbin, which by my observation is also quite hard to get done correctly and reasonably fast. It would thus be nice if Redbin didn't require invention of such kludges in order to just use it.

dockimbel commented 1 year ago

Current Redbin implementation cannot save:

  • values of types native! action! routine! handle! event! op!

??

>> rb: system/codecs/redbin
>> probe rb/decode rb/encode :insert none
make action! [[
    {Inserts value(s) at series index; returns series past the insertion} 
    series [series! port! bitset!] 
    value [any-type!] 
    /part "Limit the number of values inserted" 
    length [number! series!] 
    /only {Insert block types as single values (overrides /part)} 
    /dup "Duplicate the inserted values" 
    count [integer!] 
    return: [series! port! bitset!]
]]
== make action! [[
    {Inserts value(s) at series index; returns series past the insertion} 
    series ...
>> probe rb/decode rb/encode :parse none
make native! [[
    "Process a series using dialected grammar rules" 
    input [binary! any-block! any-string!] 
    rules [block!] 
    /case "Uses case-sensitive comparison" 
    /part "Limit to a length or position" 
    length [number! series!] 
    /trace 
    callback [function! [
        event [word!] 
        match? [logic!] 
        rule [block!] 
        input [series!] 
        stack [block!] 
        return: [logic!]
    ]] 
    return: [logic! block!]
]]
== make native! [[
    "Process a series using dialected grammar rules" 
    input [binary! any-block! an...
>> probe rb/decode rb/encode :+ none
make op! [[
    "Returns the sum of the two values" 
    value1 [scalar! vector!] "The augend" 
    value2 [scalar! vector!] "The addend" 
    return: [scalar! vector!] "The sum"
]]

dockimbel commented 1 year ago

Current Redbin implementation cannot save:

  • values within system/words context

That was not part of Redbin's goal, which was to provide a way to serialize local Red data accurately without pulling in the entirety of the global context (== the whole Red runtime environment). Even in its current form, Redbin already pulls in some parts of the global context, which is not always desirable. A possible evolution of Redbin could include a way to control how "far" it pulls references, so users can scale it to their specific needs.

hiiamboris commented 1 year ago

Thanks for correcting me, I've removed natives and actions from that list.

That was not part of Redbin goal, which was to provide a way to serialize local Red data accurately

I understand, yes. But local data may include global words that need to be saved, or it may contain words like system which enforce unwanted global context inclusion, so in real code it becomes a tangled mess in need of deep and meticulous preprocessing.

hiiamboris commented 1 year ago

Perhaps the easiest patch would be to let it accept a callback, invoked either for all values or only for those it can't save, rather than failing. A ready-made save-anything callback could also be available out of the box.

But these are just some thoughts and a use case to inform the big picture.

hiiamboris commented 1 year ago

I had other thoughts on Redbin generality here

greggirwin commented 1 year ago

The big picture thinking, and a real use case like this, is great. Thanks @hiiamboris. :+1:

hiiamboris commented 11 months ago

Another illustration of how bad it gets - saving two scalar values, carrying over the whole runtime:

f: function [geom [map!]] [
    unless geom/offset [geom/offset: 0x0]
    unless geom/size   [geom/size: system/view/screens/1/size]
    save %test.redbin probe geom
    view/options [button "TEST" [unview]] [size: geom/size offset: geom/offset]
]
f #()

Output:

#(
    offset: 0x0
    size: 1280x720
)
*** Access Error: cannot decode or encode (no codec): routine ["Internal Use Only"][bool: as red-logic! stack/arguments bool/header: T
*** Where: encode
*** Near : codec/encode :value dst
*** Stack: f save

I have to use text for the state files module, because Redbin is a no-go. Of course, that carries another risk.

dockimbel commented 11 months ago

That last example looks like a bug where the words in the map are wrongly pulling their context instead of being processed just as symbols.

hiiamboris commented 11 months ago

So is it a bug in Redbin, for not having a special case for maps, or in maps, for not removing the words' binding?

9214 commented 7 months ago

That last example looks like a bug where the words in the map are wrongly pulling their context instead of being processed just as symbols.

I'm pretty sure that's by design of map! itself:

>> map: to map! bind [foo: 'bar] context [foo: 'baz]
== #[
    foo: 'bar
]
>> get probe last keys-of map
foo
== baz
>> unset? :foo
== true

provide a way to serialize local Red data accurately

Which it evidently does, judging by the example above 🤷‍♂️

special case for maps

For the record, at the time of implementation I didn't know there was supposed to be one. IIRC all there is to map! is a convenient key/value wrapper over hash!.

9214 commented 7 months ago

WRT inability to encode global values, I think it can be rectified by collecting them in a separate context, which would serve as a localized substitute for system/words. Basically:

>> foo: 'bar
== bar
>> save/as #{} [foo] 'redbin

Would be the same as:

>> save/as #{} bind [foo] context [foo: system/words/foo] 'redbin

As for the other possibility, unmarshaling values from a Redbin payload straight into system/words: imagine loading an innocuous-looking payload where the global + is set to a function that reads your home folder, sends its content over the network, and then blows up your PC. The next time you evaluate anything in Red, it will likely do just that.