Closed Pascalio closed 1 year ago
Hi @Pascalio
That's intentional. Curly braces are essentially string literals, to make it safer to inline JSON and/or sub-shells (re-reading the docs on this, they're pretty bad: https://murex.rocks/docs/parser/curly-brace.html sorry, I'll add improving that to the todo). Whereas the square brackets have no special meaning at the parser level.
If you want to build some JSON up programmatically then you'll need to wrap those braces in parentheses: https://murex.rocks/docs/parser/brace-quote.html.
» set foo = bar
» set struct = ({"foo": "$foo"})
» $struct
{"foo": "bar"}
Sorry, I should add: the reason behind that design decision is that having curly braces as string literals means each sub-shell can lazy-load each curly brace block of code without the programmer having to worry about escaping variables.
eg
let i=0
while { = i < 10 } {
    out $i
    let i++
}
^ this would just print a column of zeros if the curly braces weren't treated as string literals, because the parser would have expanded $i. Which would mean the developer would need to escape it. But when code starts getting nested you'd need to figure out how many times to escape... and it just becomes a mess.
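To illustrate with a hypothetical, eagerly-expanding parser (this is not real murex behaviour, just the failure mode described above): the stored loop body would already contain the variable's value at parse time:

while { = i < 10 } {
    out 0      # $i was substituted with its parse-time value
    let i++
}

so every iteration prints 0, which is the column of zeros mentioned above.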
So by making them string literals you solve that problem but you then need to quote the braces if you want to override that string literal behavior.
The same problem doesn't exist with square brackets, but you raise an interesting question....should that also be treated as a string literal?
Thank you @lmorg ! I'm starting to understand. This raises a lot of questions for me... I'll mull them over before posting daft ideas and come back to you.
I'd definitely love to hear your thoughts, whenever you feel ready.
@Pascalio Any update on this? Or are you happy for me to close?
Hi @lmorg ! I'm so sorry, this topic took me on a grand tour of further questioning... Which took a lot of time and hesitation.
To come back to the reported problem, the solution of wrapping the whole thing in brace-quotes does work, and maybe I'd better stop the fuss there. However, it is a bit sad that such a fundamental feature as the ability to dynamically create associative arrays or complex objects (my attempt with JSON here) doesn't feel native. We have to resort to a hack, to creating the string from bits, in order to have our object. I have thought of many proposals to resolve this lack of native support, but in the end I keep stumbling on a certain limitation of the original murex design... I say this with humility, as I've never endeavoured to create my own language, so please bear with my potentially naive views.
Though a paradigm of shell languages, I reckon the notion of "string expansion" is faulty. A new language should aim for unconditional coherence, so that a piece of grammar has the same value in any context. If "this" does "that" "here", then it does "that" everywhere. That way, it is more memorable, feels worth learning, and becomes composable into patterns. The problem with string expansion is that it's close to what a pre-processor does in a compiled language. But as a programmer, you want variables, not pre-processor-ish substitutions. That is, in your example, we'd clearly need an "access to variable" rather than a "string expansion".
let i=0                 # variable set
while { = $i < 10 } {   # value of variable accessed
    out $i              # value of variable accessed at the moment of execution
    let $i++            # value of variable mutated at the moment of execution
}
^ in this example, the "$" is used to say "now access the variable behind the given name, don't use the name itself". This way, curly braces don't need to be string literals. Another way to put it: variables are expanded ephemerally, at the moment of command execution. I don't know if such a change of paradigm makes sense for murex, or at all... but I think it would resolve many grammar issues. Like:
let n = 1
= $n + 1
# 2, not 11 and no need to explain why some surprising result would happen
set key = foo
set object = {"$key": $n} # we can support objects "natively"
What do you think?
You do make some excellent points. I particularly like the analogy to a macro parser.
let n = 1
= $n + 1
# 2, not 11 and no need to explain why some surprising result would happen
This is actually one of my biggest bugbears with the language. It frequently trips me up, so I can completely relate to why you raise it as being ugly. The history behind that is partly because everything is a function, and partly because I got lazy and imported a module to do math notation (in the early days when murex was a POC, I wanted to focus on what brought value to the code, but I never got round to replacing that package with one that works more natively with the murex syntax).
There is an open ticket for me to resolve this (https://github.com/lmorg/murex/issues/134) however I've never allocated any time to it because the status quo was "good enough". Given your comments, I'll re-prioritize that work.
set key = foo
set object = {"$key": $n} # we can support objects "natively"
Unfortunately this is a little trickier to get right because it could introduce breaking changes to existing scripts. That isn't to say I'm against making a change, just that the syntax would need to be explicit rather than implicit so that we mitigate the risk of silently breaking running code.
My immediate first thought is we change the syntax up so as not to use set at all. In an ideal world that would mean something like: object = {"$key": $n}. Though I'd need to rewrite the parser before I could support that (I'm not against rewriting the parser; that too is POC code that never matured further because it was "good enough").
This would mean that something like:
set $key = foo
let $n = 0
out json: {"$key": $n}
# would still produce the literal string: {"$key": $n}
out json: ({"$key": $n})
# would produce: {"foo": 0}
json = {$key: $n}
out json: $json
# would produce the object: {"foo": 0}
This could even be expanded to understand the difference between strings and numbers. eg
key = "foo" # string because quoted
n = 0 # number
(just like how normal programming languages work hehe)
This would allow us to deprecate set and let without breaking backwards compatibility.
However this is a pretty substantial piece of work. So I might have to introduce it in stages, depending on how eager yourself and others are to review the changes. So I could in theory include the string interpretation and variable behavior as a new builtin (eg var or exp) while I rewrite the parser to allow support for this as native behavior.
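To put the current and proposed assignment forms side by side (the "proposed" comments describe the syntax sketched in this thread, which is hypothetical at this point):

» set key = foo    # current: set builtin, value treated as a string
» let n = 0        # current: let builtin for numerics
» key = "foo"      # proposed: string because quoted
» n = 0            # proposed: number because unquoted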
(that said -- and I'm "thinking out loud" here -- it might not be as fundamental a change to the parser as I believe, because murex already has a concept of special parameters that modify the behavior of each command. So this could piggyback on that pattern...)
Thanks again for your feedback. As you've probably guessed, almost all of murex's design and development has been done by myself, so there will be plenty of places where I've simply not given enough consideration. Hearing other people's thoughts enables me to write better code and build a better shell, not only for yourself but for myself too. So feedback like this is really valuable. Thank you
Thank you for your welcoming open-mindedness and responsiveness! It's great!
Unfortunately this is a little trickier to get right because it could introduce breaking changes to existing scripts. That isn't to say I'm against making a change, just that the syntax would need to be explicit rather than implicit so that we mitigate the risk of silently breaking running code.
Yes, you're right, backward compatibility is important. Your idea of transitioning by leaving set/let as they were and implementing a new syntax sounds clever. Especially since the new one is more concise.
set key = foo
let n = 0
out json: {"$key": $n}
# would still produce the literal string: {"$key": $n}
So, this ^ would mean that the magic only happens in the var_name = data syntax, but otherwise in the code, $var_name wouldn't be a "variable access", right? Or would it mean that the {} chars have the ability to block variable access? Anyway, maybe I'm doing linguistic nitpicking here...
More importantly, and along the lines of your suggested syntax, I can also suggest another approach.
Currently, arrays are represented as "item1\nitem2\n...." which isn't handy to create.
my_array = apples\nbananas\ncoconuts
Fortunately, there's a command to ease things. This is nice, but again, doesn't feel like array creation is "natively supported". Practically, it means we can't
my_array = a: [apples,bananas,coconuts]
we must
my_array = ${a: [apples,bananas,coconuts]}
Why not have an alternative representation of arrays, so that we can directly assign them to variables? Something like:
[Item1, Item2, Item3]
(where strings containing " , " must be quoted...)
This would be alongside the current newline representation so that both
[apples, bananas, nuts] -> foreach: fruit {out: have some $fruit}
ls -> foreach: inode {rm $inode}
would work.
Then we could say that arrays are simply objects with auto-assigned numerical keys: [Hi, you] == [0: Hi, 1: you]. And then reuse the same syntax for complex objects.
foo = scones
bar = clotted cream
grapes = [grape1, grape2, grape3]
healthy_lunch = [base: $foo, additional: $bar, "why not": $grapes]
We'd still have the "[]" command though.
$healthy_lunch -> ["why not"] -> foreach: grape {swallow: $grape}
And "{" would still prevent variable expansion/access.
Or have I gone just banana shake?
That could work. The ->[] index is only usable as a method, so it shouldn't be hard to say: when [] is a function, generate an array; but when it's used as a method, filter instead.
I'd probably lean towards removing the comma (eg [grape1 grape2 grape3]) though. The reason being that you don't comma-separate parameters in shells, so it would keep the syntax more "shell like" while still providing a richer syntax.
Agree, it's better without the comma!
Yes, [] is a method when its stdin is piped, so it could be seen as a function when not, and then generate arrays. But then, if it's a function, we can't assign it directly to a variable, right? We must set var = ${ [ stuff ]}. Can't set var = [ stuff ].
I can make var = [1 2 3] work but that would be a special case.
As an aside, I really need to cut down on these special cases. The problem is backwards compatibility and the expectation for parameters to be barewords in shells.
Re the comma: would it make sense to make the delimiter space and/or comma? So you can have LISP-like lists (eg [ apples pears oranges ]) but also drop in raw JSON too (eg [ "apples", "pears", "oranges" ]). I think that might hit the sweet spot convenience-wise...
Yeah, absolutely, special cases should be avoided as much as possible. That's why I suggested that arrays be "represented" like [1 2 3], and not that [ be a new command. Currently, arrays are strings with newline chars, right? "1\n2\n3" is an array. If [1 2 3] is an array (too), then there's no need for a [ command and its special cases. Does it make sense?
Regarding JSON style, sure, if you deem it clever and it doesn't get in your way. I was thinking the format json command would instead be applied to our [1 2 3] array to produce the JSON-styled one, but maybe you have use cases I haven't thought of where it's better to create it in JSON style?
The problem is that shells accept bareword parameters. So how does one differentiate between
command [ this is a string ]
and
command [ this is an array ]
I guess one could argue that quoting the first command forces it to become a string, eg
command "[ this is a string ]"
...and make it explicit in the docs that the shell does "best guess" interpretations of how a bareword should be treated?
Another option is to add a little syntactic sugar around objects being infixed. For example (and to borrow from Perl), %{} is an object and @[] is an array. This adds an extra character to what you envisaged but does make intent explicit.
Both options have their benefits and drawbacks, so I'm undecided about which is the smarter option.
I was thinking the format json command would instead be applied to our [1 2 3] array to produce the json styled one, but maybe you have use cases I haven't thought of where it's better to create it in json style?
You wouldn't even need format json since [] (and {}) expressions could natively output arrays and objects as JSON. In memory they would be an object, but when converted to a string (such as a process writing to STDOUT) they'd get converted into a JSON document.
My thinking about supporting commas was just to make copy/pasting easy. Eg you have some JSON from some text file and you want to quickly dump it into the terminal. Very much an edge case, but it would be trivial to support both commas and whitespace.
As an aside, I've gotten the new expressions library mostly working now. It supports:
- >, == et al, plus I've added ~~ to support comparing similar data such as "5" ~~ 5, "True" ~~ true, "LOWERCASE" ~~ "lowercase"
Support pending:
- =+ et al (= has been added so the rest of the work here will be easy)
- ${} to execute a command whose output is evaluated as part of the expression, eg bob = ${command}
The new expressions can be invoked either via:
- the exp command (this is only a stop gap while I test it), eg foobar = "foo" + "bar"
- $(), eg echo $(2+2) (planned but not yet written). The idea being ${} executes a command and $() executes an expression.
My only concern with this is whether the distinction between ${} and $() is obvious to anyone who hasn't read the source code...?
edit: these changes are currently sat on my dev machine. I haven't even pushed them to the develop branch yet.
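Collecting those invocation forms in one place (a sketch of the proposed syntax; exp, ${} and $() are all pre-release at this point, so treat the details as provisional):

» exp foobar = "foo" + "bar"   # stop-gap builtin while the library is tested
» bob = ${command}             # ${} executes a command, its output feeds the expression
» out $(2+2)                   # planned: $() evaluates an expression inline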
Thinking about the different parsers a little more: having %{} and @[] when in ${}, but {} and [] in $(), seems very confusing. So that's a strong argument in favour of having arrays defined as barewords.
However that introduces a new problem. How do you differentiate between a code block as a parameter and a map/object as a parameter? eg
if { code block } then { code block } else { code block }
vs
command { "key": $value }
I might be panicking over nothing though. Just trying to explore all angles before I accidentally create backwards incompatible tech debt :)
You're definitely right to think things through instead of being sorry afterwards.
For the reasons you underlined, I would favour your first solution:
cmd [ this is an array ]
cmd "[ this is a string ]"
However that introduces a new problem. How do you differentiate between a code block as a parameter and a map/object as a parameter. eg
This would actually be avoided if maps and arrays share the same format, wouldn't it?
if { code block }
command [ key1: value1 key2: value2 ] # map
command [ 0: value1 1: value2 ] # explicitly-mapped array
command [ value1 value2 ] # implicitly-mapped array
command "[ value1 value2 ]" # just an old string
To answer your question about expressions: it isn't obvious to me, who hasn't read the code. But I could take an educated guess: commands are the very first item of a line, whereas an expression combines items with some combining symbol in between, not in the first position. Based on this, ${} or $() should be chosen. Right?
That said, I'm afraid my gut reaction, if I were discovering a language with this kind of syntax distinction, wouldn't be very positive. The reason is that I'm personally looking for a grammar that has simple building blocks which can be stacked and composed with flexibility, and it wouldn't be apparent at first glance why 1 + 2 is a totally different world from echo Hi. Highly case-specific grammar is what put me off oilshell, for example (together with it eating my bath towels...)
But then, in the end, it's all about pedagogy and conviction: I'm sure you have strong reasons supporting this; just believe in it, and a good explanation in the docs will appease simplicity-seeking gut reactions like mine!
I've been thinking about this for a while and I believe there are a few insurmountable problems preventing the adoption of expressions as the primary syntax for this shell:
() as an additional way to quote strings does save murex from the \\\\ hell that Bash (et al) suffer from.
None of this is to say we cannot find a solution that compromises without feeling like it is making (undue) concessions. Just that the more I think about this problem, the more additional problems I find with any of the potential solutions. So I'm still very much at the research phase for "how to implement this the best way".
Anyway, sorry for the monologue. I wanted to get some thoughts jotted down while they were fresh in my head :)
I've had another thought. Not 100% sure I like this but putting it out there for consideration: how about % to build arrays and dictionaries?
If this is used in expressions it does lead to slightly uglier syntax, eg
» array = %[ foo bar baz ]
» dic = %{ foo: bar, hello: world }
but it would then allow commands/statements to adopt the same code:
» out %{ foo: bar, hello: world }
{
"foo": "bar",
"hello": "world"
}
So there isn't any mental switching between expressions and commands. % would become a generalised "structure builder".
(edit: since the code is already written to support » foo = [ bar ] (without %) expressions, I could keep that around as an "undocumented feature", but the published standard, if we were to proceed this way, would be %[])
Additionally, mkarray (https://murex.rocks/docs/commands/a.html) can be wrapped into @[] too (basically anything inside @[] that includes .. as a bareword will be treated as a mkarray call)
This is perhaps the easiest way to bring your excellent points into reality without breaking backwards compatibility nor (hopefully) compromising too much on your desire for reducing cognitive overhead.
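If I've read that right, the wrapping would look something like this (hypothetical until the code lands):

» a: [1..3]      # today's mkarray invocation
» @[ 1..3 ]      # proposed: the bareword .. inside @[] triggers the same mkarray call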
The next big problem is the remainder of the expressions. eg
# easy to identify as an expression because 2nd parameter is an `=`
» bob = 1 + 2
whereas
# do I want to output 3 or "1 + 2"?
» out 1 + 2
# is this assigning "/dev/sda" to $if?
» dd if=/dev/sda of=/dev/sdb
# is this assigning 2 to $maxdepth?
» find / -maxdepth=2
I think this is solvable with a lot of parser logic because:
- "-" is an invalid variable name (so -maxdepth would be an invalid token)
- likewise /dev/sda/ would be an invalid token
But then there's the issue of whether out 1 + 2 should be valid?
I could argue that spaces in expressions when used in commands are invalid. But that introduces a number of new issues:
- » bob = 1 + 2 is now valid but » out 1 + 2 is not. When is it safe to add whitespace and when not?
- if I ever want to support the mod operator (typically %) then I cannot, because while » bob = %[] is explicit, » out bob=%[] is not.
Another option is I just say assignments are not allowed in commands. Which solves all the instances where = would be embedded. And to be honest it also potentially saves future murex users from reading back weird shell code where parameters have side effects with variables.
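In other words, under that rule (a sketch of the proposed behaviour, not current murex):

» bob = %[ 1 2 3 ]     # allowed: an explicit assignment statement
» out bob=%[ 1 2 3 ]   # bob=... would just be a literal parameter; no assignment happens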
However parsing "whitespaced" expressions as parameters is still not an easy problem. eg:
» command foo 1 + 2 bar
...should 1 + 2 be parameter 3 for command? It's pretty explicit that foo is the first parameter and bar is the last. But it's less clear how 1 + 2 should be interpreted.
So there's still an argument for disallowing whitespace in expressions in parameters.
(edit2: another option is that the same ${} subshell is used to embed expressions, eg » out ${1 + 1}. I'd then need to write support so that all expressions can be used (eg » 1 + 1 instead of having a special exception for » result = 1 + 1 formats), but that is a lot easier than solving all of the issues above and would result in a lot more uniform syntax)
A lot to think about, but it feels like we're inching towards a design that works :)
Thank you for the extensive explanations! I've read your article on split personalities and things are getting clearer. :)
2. murex already supports bareword strings. The original version of murex (around a decade ago, and before it was even called "murex") had syntax a lot closer to what we're trying to emulate here but it also requires executables to be encapsulated in parenthesis and strings to be quoted. This made for a really crappy interactive shell. As I made concessions (eg making quotes optional) the design slowly became more akin to what we have now and how more traditional shells work. Ultimately there is a trade off between the scripting side of things and the REPL (funny enough I wrote about this last year: https://murex.rocks/docs/blog/split_personalities.html)
Indeed, current murex makes for a much more pleasant REPL. (One of the most pleasant ones too imo!)
3. Parenthesis used as quotation marks. In hindsight this is one of the kludgier ideas I've implemented as it now limits some of the easier solutions (ie "just wrap expressions in parenthesis like you would naturally do in expressions anyway"). However having a quotation mark that is different symbols for opening and closing a string does bring some genuine enhancements, since having nested quotes is a depressingly common problem in shell scripting. So using () as an additional way to quote strings does save murex from the \\\\ hell that Bash (et al) suffer from.
I love parentheses as quotation marks. They're indeed much clearer than ". But yeah, if '()' denotes a string, then it can't denote an expression.
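For anyone skimming: because the opening and closing marks differ, parentheses can nest where double quotes cannot, which is the \\\\ escape-hell fix discussed above. A rough sketch based on the brace-quote docs linked earlier (exact behaviour may vary by version):

» out (a (nested (string)) with "quotes" that need no escaping)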
Your very last option feels very good to me: it is high-level and unified. So, if I understand correctly:
1 + 1 # 2
command ${command}
command ${expression}
result = 1 * 0 # direct assignment
# What about commands output capture on the right side of "="?
content = cat file -> head # or
content = ${cat file -> head} # ?
# Just wondering, I don't have any opinion about this at the moment.
cat file -> head -> (set) content # remains one of murex's greatest pieces of syntax in my view!
And yes, I'm not sure about the benefits of variable assignment within command calls... But given the dangers and downsides, you'd be right to disallow them, I think. Variables should always be assigned or mutated in a very visible way.
As to %, yes, it sounds good. As long as we can do
foo = "bar"
arr = %[ baz $foo ]
dict = %{ key1: $arr, $foo: booze }
Additionally mkarray (https://murex.rocks/docs/commands/a.html) can be wrapped into @[] too (basically anything inside @[] that includes .. as a bareword will be treated as a mkarray call)
Didn't get you there. You mean we can't %[ 1..5 ] but we can %[ @[1..5] ]?
Another daft question: why does % work where a single [ didn't? Not a rhetorical "upset" question, it's just that I'm not sure I understand everything. :)
Is it because cmd [array] would be ambiguous with cmd "[string]"?
Or is it because cmd {dict} breaks { literality and cmd [dict] isn't a good option?
@Pascalio
Indeed, current murex makes for a much more pleasant REPL. (One of the most pleasant ones too imo!)
Thank you <3
I love parentheses as quotation marks. They're indeed much clearer than ". But yeah, if '()' denotes a string, then it can't denote an expression.
I'm tempted to say () for strings should be %(), to bring it in line with %[] for arrays and %{} for objects/maps/hashes/dictionaries... whatever term you prefer. (In the source code I refer to them as "objects" but that might be confusing due to how it's understood in object orientated programming. So I might also take this opportunity to come up with a more sensible term to use consistently in the code and documentation.)
Your very last option feels very good to me: it is high-level and unified. So, if I understand correctly:
You understood perfectly in those examples :) In fact that code is now working:
(should be pushed to the develop branch shortly but still very beta!)
What about commands output capture on the right side of "="?
content = cat file -> head # or
content = ${cat file -> head} # ?
It would have to be the latter because the former would introduce too much:
So you do still end up with a two-tiered language syntax (which will need to be documented clearly), but at least you don't need to worry about which is the correct symbol to invoke expressions vs commands; you can just treat them as part of the same language, albeit with different nuances within that language. Though happy to take any further opinions into consideration.
As to %, yes, it sounds good. As long as we can do
Certainly can :) I haven't written the parser for dicts yet though.
Didn't get you there. You mean we can't %[ 1..5 ] but we can %[ @[1..5] ]
I mean %[ 1..5 ] will (eventually, haven't yet written the code) be supported so you're not having to nest blocks like %[ @[1..5] ]. In fact %[] will become the de facto way to call a in the future.
Another daft question: why does % work where single [ didn't? Not a rhetorical "upset" question, it's just that I'm not sure to understand everything. :)
That's a fair question.
There's a few reasons behind that decision:
- {} is a string literal and was designed that way intentionally to make parsing shell scripts easier where you have nested code blocks (eg function foobar { out $foo }, I didn't want to prematurely expand the variable inside the function). This made sense when I was rapidly knocking up this shell as a proof of concept ~10 years ago and it could be argued that the shell has now outgrown that requirement. But if I can avoid having to spend a few weeks rewriting the main parser and all its tests then I'd prefer that. At some point that piece of work is going to need to happen anyway, but for now there's much more low hanging fruit that I can spend my time on (like your other suggestions, as it happens)
- [] for both dicts and arrays: I think it's not very clear if [ 1: 2, 3: 4 ] is an array or dict. At least not when glossing over code. Also, since arrays and dicts are displayed as JSON when converted to a string, it makes sense to have their syntax loosely match JSON too.
- [ filter ] is already a command. So it would mean you could only use [] for arrays if you were to assign (rather than use them as a function), eg arr = [ 1 2 3 ] would create an array but [ 1 2 3 ] would filter it. Which is rather hard to reason about at a glance.
- so having % as a default prefix for "creating new things" becomes explicit. eg %[] creates an array, %{} creates a dict, and %() creates a string. It should then (hopefully) be clear at a glance what these brackets are doing.
That's my thinking anyways.
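Pulling the % family together as proposed at this point in the thread (still in development here, so the syntax is provisional):

» str = %(hello world)     # %() creates a string
» arr = %[ 1 2 3 ]         # %[] creates an array
» dict = %{ a: 1, b: 2 }   # %{} creates a dict
# while a bare [ ... ] used as a method keeps its existing filter behaviour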
Added support for dicts %{}
Added support for %[[..]] (currently needs to be double square brackets because a (et al) actually behave that way, eg
a: [1..3]bob
# Outputs
# 1bob
# 2bob
# 3bob
would be the same as %[[1..3]bob], but I plan on adding a little extra logic that allows %[..] as a shorthand for %[[..]] in a future patch)
Added patch to support %[..] (as per https://github.com/lmorg/murex/issues/485#issuecomment-1339967448)
I'm pretty close to having this ready for a release. Though I want to keep it in develop for a couple of weeks longer to beta test further.
Hi @lmorg ! Sorry for the silence, I was off on the other hemisphere for a few days... This looks wonderful! I wish I could contribute with coding and not only opinions, but my go is a bit of a no-go...
I'm tempted to say () for strings should be %(), to bring it in line with %[] for arrays and %{}
Sounds good.
objects/maps/hashes/dictionaries...whatever term you prefer.
Looks like you've gone for "dictionaries". I don't have any opinion about this... Except that "map" is shorter. But dictionary is certainly fine.
- so having % as a default prefix for "creating new things" becomes explicit. eg %[] creates an array, %{} creates a dict, and %() creates a string. It should then (hopefully) be clear at a glance what these brackets are doing.
Good point, things are clearer this way.
I'm anxious to try the new code. Will maybe try to compile from develop.
@Pascalio
This looks wonderful! I wish I could contribute with coding and not only opinions, but my go is a bit of a no-go...
Honestly, the feedback you've provided has been invaluable. Worth just as much, if not more so, than any code pull requests.
I'm anxious to try the new code. Will maybe try to compile from develop.
Please do give develop a try. I've just dropped a massive update into develop which sees the main parser completely rewritten to integrate the new syntax changes (so the expressions aren't just an ad hoc bolt-on).
So there may well be some breaking changes where I haven't entirely copied across the parsing rules correctly (the test suite has caught most of the problems but worth running the dev build for a little while to see if any others pop up).
I haven't (yet) rewritten the fuzzers for the new parsers so there might be some edge cases that can cause a crash. That's also on my TODO.
Line numbers will definitely be reported wrong though. That's still a work in progress.
I also need to get the documentation updated.
Brilliant! I'll give it a good try, thank you!
Awesome, thanks @Pascalio . I'm pushing updates pretty much daily at the moment, so if there's an issue you've run into there's a chance I might have already fixed it.
I'm going to aim for a release on ~1st Jan. So let me know how you get on. And if you don't get time before then, that's fine. I appreciate how busy everyone is, particularly at this time of year.
Season's greetings @lmorg ! Have just grabbed the develop version. Everything's smooth so far. I'll be reporting in separate reports if needed!
Describe the bug: It seems variable expansion is not performed within {}... using version v2.11.2200 on Linux.
set foo = bar
set struct = {$foo}
$struct
{$foo}
Expected behaviour: Typing $struct should yield bar, right?
This is a problem if you want to create json data from variables... Apart from concatenating strings then casting to json, is there any other way to create:
from $foo?