ISISComputingGroup / lewis-ess

Let's write intricate simulators!
GNU General Public License v3.0

Implement scanf for StreamAdapter's Cmd and Var #216

Closed (MichaelWedel closed 7 years ago)

MichaelWedel commented 7 years ago

We've been talking about this from the very beginning of the project - it would be much simpler for some cases to provide scanf-like patterns instead of regular expressions. It turns out that the author of the scanf-package has made it available on PyPI and it has been made Python 3 compatible:

https://pypi.python.org/pypi/scanf

There is also a function that translates the scanf-pattern into a regex, that would allow us to use it relatively easily without changing the code of Cmd and Var too much.

MichaelWedel commented 7 years ago

I started playing with this a bit yesterday and I'm not entirely sure what the best option is. There is scanf_compile in scanf, which transforms the (%f)-strings into a regular expression, so that does exactly what we need. But. It also transforms "normal" regular expressions to an escaped version of the regular expression. The other way around, strings containing %f are also valid regular expressions, so I think it's going to be very hard to have users specify either a regex or a scanf-string and autodetect the difference.
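To make the ambiguity concrete, here is a minimal sketch. It uses a hand-rolled scanf_to_regex as a stand-in for scanf_compile (the name, the placeholder table and the exact regex fragments are assumptions for illustration, not the scanf package's code), and it only knows %f and %d:

```python
import re

# Hypothetical placeholder table; a real scanf implementation supports more.
_PLACEHOLDERS = {
    '%f': r'([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)',
    '%d': r'([-+]?\d+)',
}


def scanf_to_regex(pattern):
    """Turn %f/%d into capture groups and escape all other text literally."""
    parts = re.split(r'(%[fd])', pattern)
    return ''.join(_PLACEHOLDERS.get(part, re.escape(part)) for part in parts)


# A scanf-style pattern becomes a usable regex with one capture group.
float_match = re.fullmatch(scanf_to_regex('V=%f'), 'V=54e-1')

# A "normal" regex fed through the translator is escaped, so it only
# matches its own literal text and no longer matches a real command.
escaped = scanf_to_regex('^V=(.*)$')
escaped_matches_command = re.fullmatch(escaped, 'V=1.5') is not None
escaped_matches_itself = re.fullmatch(escaped, '^V=(.*)$') is not None

# The other way around: 'V=%f' compiles fine as a regex, it just matches
# the literal characters '%f', so neither parser rejects the other's input.
literal_match = re.fullmatch('V=%f', 'V=%f') is not None
```

Both directions succeed without raising an error, which is why autodetecting the pattern type from the string alone looks fragile.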

What scanf_compile also does is that it automatically creates a list of converters for the matched arguments - perfect for the conversion function list that we allow for Cmd and Var! So what we could do is implement a small wrapper class that does not do much except "not being a string" so that it can be recognized by Cmd and Var.

Then, users would specify something like this:

cmds = [
    Cmd(scanf('V=(%f)'))
]

And the rest will happen automatically (i.e. automatically use the conversion functions unless specified by the user). The exact naming of this class causes me a bit of trouble, I don't think scanf is a good name - do you have any other suggestion? For the sake of implementing it I'm just going to use some name.
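A rough sketch of how such a wrapper could work, under stated assumptions: ScanfPattern and parse are hypothetical names, the translator is hand-rolled rather than taken from the scanf package, and placeholders are written without surrounding parentheses because this translator adds the capture group itself. It is not how Cmd or Var are actually implemented.

```python
import re

# Hypothetical table: regex fragment plus default converter per placeholder.
_PLACEHOLDERS = {
    '%f': (r'([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)', float),
    '%d': (r'([-+]?\d+)', int),
}


class ScanfPattern:
    """Marker type: wraps a scanf-style string so it is 'not a string'."""

    def __init__(self, pattern):
        self.pattern = pattern

    def compile(self):
        """Return (compiled regex, list of default converters)."""
        regex_parts, converters = [], []
        for part in re.split(r'(%[fd])', self.pattern):
            if part in _PLACEHOLDERS:
                group, converter = _PLACEHOLDERS[part]
                regex_parts.append(group)
                converters.append(converter)
            else:
                regex_parts.append(re.escape(part))
        return re.compile(''.join(regex_parts)), converters


def parse(pattern, command, converters=None):
    """Plain strings are treated as regexes, ScanfPattern as scanf."""
    if isinstance(pattern, ScanfPattern):
        regex, default_converters = pattern.compile()
        # User-supplied converters take precedence over the automatic ones.
        converters = converters or default_converters
    else:
        regex = re.compile(pattern)
        converters = converters or []
    match = regex.fullmatch(command)
    if match is None:
        return None
    groups = match.groups()
    # Groups without a converter are passed through as strings.
    converters = list(converters) + [str] * (len(groups) - len(converters))
    return [convert(group) for convert, group in zip(converters, groups)]
```

With this, parse(ScanfPattern('V=%f'), 'V=54e-1') converts the argument to a float automatically, while a plain regex string keeps returning the raw matched text unless converters are passed in.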

MikeHart85 commented 7 years ago

Can't think of a better name either right now.

But another option, to avoid additional classes, would be to require ^ and $ for regex and assume scanf if they are missing.

MichaelWedel commented 7 years ago

Interesting, had not thought of that, thanks! I'm not entirely sure it's safe though. We've been doing this ^...$ in all of our examples, but of course there can be regular expressions which follow some other pattern. Or it could be helpful to use $ in some of the scanf-patterns?

MikeHart85 commented 7 years ago

I can't think of a case where you would want regex but not ^...$, because of how commands come in one-by-one (based on terminators). You should always be interested in / know what the whole command looks like.

For scanf, I don't know of any examples of this, but you could imagine some devices use $ or ^ somewhere as part of the command. But it's very unlikely they would use both ^ at the beginning and $ at the end (that's just too distinctly a regex thing). On the off chance that they do, we could substitute %^ for ^ (%^...$ would no longer be a regex, but scanf instead... but first replace %^ with ^).
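The rule proposed here could be sketched roughly as follows. The function name classify_pattern and the returned tuple are made up for illustration; only the ^...$ check and the %^ escape come from the comment above.

```python
def classify_pattern(pattern):
    """Guess the pattern language from the string alone.

    Returns a (kind, pattern) tuple where kind is 'regex' or 'scanf'.
    """
    # Anything wrapped in ^...$ is assumed to be a regular expression.
    if pattern.startswith('^') and pattern.endswith('$'):
        return 'regex', pattern
    # Escape hatch: a leading %^ means "scanf pattern starting with a
    # literal ^", so replace it with ^ before handing it to scanf.
    if pattern.startswith('%^'):
        pattern = '^' + pattern[2:]
    return 'scanf', pattern
```

For example, classify_pattern('%^V=%f$') yields a scanf pattern whose text is '^V=%f$', whereas '^V=(.*)$' is left alone and treated as a regex.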

MichaelWedel commented 7 years ago

The command parsing of a device could be "greedy" in the sense that it doesn't care about "garbage" after the actual command... in that case I guess you wouldn't want $.

So I'm still not entirely sure. I guess convenience is good and I'd like to have that. But if we can't guarantee that it's "safe" because the two "languages" have to be treated differently even though their syntaxes are very similar (and almost 100% overlapping) then I'd probably be okay with a little inconvenience.

MikeHart85 commented 7 years ago

What if we forget the $ and just focus on:

MikeHart85 commented 7 years ago

Well, and also...

a device could be "greedy" in the sense that it doesn't care about "garbage" after the actual command

...could also be accomplished by just consuming "garbage": ^my_value=([0-9]+).*$
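A quick check of that regex with Python's re module (parse_value is just an illustrative helper name):

```python
import re

# Anchored at both ends, but .* swallows whatever follows the number.
PATTERN = re.compile(r'^my_value=([0-9]+).*$')


def parse_value(command):
    """Return the integer after 'my_value=', ignoring trailing garbage."""
    match = PATTERN.match(command)
    return int(match.group(1)) if match else None
```

Here parse_value('my_value=42 trailing garbage') and parse_value('my_value=42') both give 42, while a command without digits after the equals sign gives None.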

MichaelWedel commented 7 years ago

I've pushed a branch where I implemented a class called fmt (format is a builtin function) and used it in example_motor. The T= command is now much cooler because it also accepts stuff like 54e-1 - I think we definitely want that :) But that's slightly beside the point... somehow I'm still reluctant to compromise by introducing special rules for both "languages".

To avoid cluttering every single Cmd with fmt for some devices, what do you think about the following? I'd assume that within a device, commands would be "regular" in that they would either all be scanf-format type patterns or all regular expressions. What about providing an attribute in StreamAdapter like default_pattern_language or pattern_type? It would be regex by default (which would be implemented to preserve current behavior), but could easily be set to fmt. If only a few commands use fmt or regex, they could still be specified explicitly.
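Roughly what I have in mind, as a sketch only: the regex wrapper, the PatternMarker base class and the resolve_patterns helper are hypothetical names, and the real StreamAdapter would of course do more than tag strings.

```python
class PatternMarker:
    """Base for explicit pattern wrappers; kind is set by subclasses."""
    kind = None

    def __init__(self, text):
        self.text = text


class regex(PatternMarker):
    kind = 'regex'


class fmt(PatternMarker):
    kind = 'fmt'


def resolve_patterns(patterns, default_pattern_type='regex'):
    """Tag each pattern with its language.

    Plain strings get the adapter-wide default; wrapped patterns keep
    the language they were explicitly given.
    """
    resolved = []
    for pattern in patterns:
        if isinstance(pattern, PatternMarker):
            resolved.append((pattern.kind, pattern.text))
        else:
            resolved.append((default_pattern_type, pattern))
    return resolved
```

With the default left at 'regex', existing devices behave as before. A device that sets the default to 'fmt' can write plain 'V=%f' strings everywhere and wrap the odd exception in regex(...).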

MichaelWedel commented 7 years ago

I've pushed a branch with this idea: https://github.com/DMSC-Instrument-Data/lewis/tree/216_scanf_for_var_cmd

I'll try and open a PR later with some more explanations. In example_motor it's already used.