binref / refinery

High Octane Triage Analysis

Usage example for Push does not work #7

Closed baderj closed 2 years ago

baderj commented 2 years ago

I can't get the usage example for push to work:

emit key=value | push [[| rex =(.*)$ $1 | pop v ]| repl var:v censored ]

First off, the rex syntax seems to have changed; the $1 is now taken literally:

❯ r.emit "key=value" | r.rex '=(.*)' '$1' 
$1

Instead it should probably be {1}:

❯ r.emit "key=value" | r.rex '=(.*)' '{1}' 
value

But even after correcting this it still fails because the variable is not defined:

> r.emit "key=value" |  r.push -vv [ [ | r.rex -vv '=(.*)' '{1}' | r.pop -vv v ] | r.repl --v 'var:v' censored ]

(07:30:25) verbose in rex: regular expression: re.compile(b'=(.*)', re.DOTALL)
(07:30:25) verbose in pop: buffering invisible chunk
--- Logging error ---
Traceback (most recent call last):
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/argformats.py", line 621, in extract
    result = meta[name]
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 500, in __getitem__
    item = super().__getitem__(key)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 511, in __missing__
    raise KeyError(F'The meta variable {key} is unknown.')
KeyError: 'The meta variable v is unknown.'

Maybe I'm doing the whole variables thing wrong, because even this fails because k is not set:

> r.emit "test" | r.put -vv k "test" | r.repl -vv var:k hello
Message: 'delayed argument initialization failed:'
Arguments: ('The variable k is not defined.',)

I'm using version 0.4.6 of binary-refinery, on Ubuntu with bash.

baderj commented 2 years ago

I also tried some examples from the documentation of meta variables, for example:

r.emit "FOO" [| r.put -vv x "BAR" | r.cca -vv var:x ]] 

This just throws an exception:

Traceback (most recent call last):
  File "/home/zac/.local/bin/r.cca", line 8, in <module>
    sys.exit(cca.run())
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1669, in run
    source | unit | output
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1312, in __or__
    for last, chunk in lookahead(self):
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/tools.py", line 35, in lookahead
    peek = next(it)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1151, in __next__
    self._chunks = iter(self._framehandler)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1176, in _framehandler
    self._framed = Framed(
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 466, in __init__
    self.unpack = FrameUnpacker(stream)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 376, in __init__
    self._advance()
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 389, in _advance
    self._next = Chunk.unpack(stream)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 304, in unpack
    return cls(data, path, view=view, meta=meta, fill=fill)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 200, in __init__
    m.update(meta)
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 289, in update
    self.fix()
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 292, in fix
    for key, value in self.items():
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 312, in items
    if not is_valid_variable_name(key):
  File "/home/zac/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 177, in is_valid_variable_name
    if not name.isidentifier():
AttributeError: 'bytes' object has no attribute 'isidentifier'
baderj commented 2 years ago

I think I found the problem: when the var handler tries to extract the variable from the chunk meta, it looks up the variable name as a string, name = 'x'. However, printing the data reveals that the variable name is stored as bytes:

meta = {b'x': b'BAR'}

Because "x" is not the same as b"x", this lookup raises a KeyError:

result = meta[name]
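
The mismatch described above can be reproduced with a plain dictionary. This is only a minimal sketch and not refinery's actual lookup code; lookup and normalize_keys are hypothetical helper names used for illustration.

```python
# Metadata as it arrived in the failing setup: keys are bytes, not str.
meta = {b'x': b'BAR'}


def lookup(meta, name):
    """Return meta[name], or None when the key is missing (hypothetical helper)."""
    try:
        return meta[name]
    except KeyError:
        return None


def normalize_keys(meta, encoding='utf-8'):
    """Decode bytes keys to str so string lookups succeed (hypothetical helper)."""
    return {
        key.decode(encoding) if isinstance(key, bytes) else key: value
        for key, value in meta.items()
    }


# Looking up the str key fails because 'x' and b'x' are different keys.
missing = lookup(meta, 'x')
# After decoding the keys, the same lookup succeeds.
found = lookup(normalize_keys(meta), 'x')
```

Here missing is None and found is b'BAR', which matches the KeyError seen in the traceback: the data is present, but under a key of a different type.
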
huettenhain commented 2 years ago

Problem 1

You are correct that this one uses outdated syntax:

emit key=value | push [[| rex =(.*)$ $1 | pop v ]| repl var:v censored ]

And you are also correct that this one will crash:

r.emit "key=value" |  r.push -vv [ [ | r.rex -vv '=(.*)' '{1}' | r.pop -vv v ] | r.repl --v 'var:v' censored ]

However, the second one crashes because you put a space between the two opening square brackets, and refinery will only consider the very last command line argument token for frame nesting. So this one will work:

r.emit "key=value" |  r.push -vv [[ | r.rex -vv '=(.*)' '{1}' | r.pop -vv v ] | r.repl --v 'var:v' censored ]

And indeed it does:

(venv) [22:43:50][rattle@misato:/tmp]
$ r.emit "key=value" |  r.push -vv [[ | r.rex -vv '=(.*)' '{1}' | r.pop -vv v ] | r.repl --v 'var:v' censored ]
(10:46:16) verbose in rex: regular expression: re.compile(b'=(.*)', re.DOTALL)
(10:46:16) verbose in pop: buffering invisible chunk
key=censored
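
To illustrate why the space matters, here is a rough model of the rule as described above, where only the last command-line token is inspected for opening brackets. The function opening_depth is hypothetical and is not taken from refinery's source; it only mimics the described behaviour.

```python
import shlex


def opening_depth(command: str) -> int:
    """
    Hypothetical model: return the number of opening brackets in the last
    token of a command, or 0 if that token contains anything else.
    """
    tokens = shlex.split(command)
    if not tokens:
        return 0
    last = tokens[-1]
    if set(last) == {'['}:
        return len(last)
    return 0


# With a space, the two brackets are separate tokens and only the last counts.
separated = opening_depth("r.push -vv [ [")
# Without a space, both brackets are part of the same final token.
joined = opening_depth("r.push -vv [[")
```

Under this model separated is 1 and joined is 2, so the first variant opens one frame layer fewer than the closing brackets later in the pipeline expect.
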

I fixed the documentation issue where I still had the old syntax for rex; https://github.com/binref/refinery/commit/8646522b574fe052f337bfe5358318547c39c691 should take care of this. For now, I will assume that problem 1 is solved.

Problem 2

Regarding the second problem, I cannot reproduce it:

$ uname -a
Linux misato 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
(venv) [22:42:45][rattle@misato:/tmp]
$ r.emit "FOO" [| r.put -vv x "BAR" | r.cca -vv var:x ]]
(10:42:46) verbose in put: storing bytes: BAR
FOOBAR

Can you try to update to see if that fixes it? If it does not, would you mind installing refinery into a separate virtual environment to see whether you can reproduce it there? If you can, could you provide a list of installed packages so I can try to recreate your environment as accurately as possible?

baderj commented 2 years ago

Thanks a lot for the documentation fix and the helpful hints. I was able to fix the problem, which was caused by an older msgpack version.

I updated to the latest version from Git. Both commands run successfully in the virtual environment. Outside the virtual environment, however, both r.emit "key=value" | r.push -vv [[ | r.rex -vv '=(.*)' '{1}' | r.pop -vv v ] | r.repl --v 'var:v' censored ] and r.emit "FOO" [| r.put -vv x "BAR" | r.cca -vv var:x ]] raise the same exception, which again is related to the variable name being bytes instead of a string:

> emit "FOO" [| put -vv x "BAR" | cca -vv var:x ]]

(12:53:57) verbose in put: storing bytes: BAR
Traceback (most recent call last):
  File "/home/username/.local/bin/r.cca", line 8, in <module>
    sys.exit(cca.run())
  File "/home/username/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1669, in run
    source | unit | output
  File "/home/username/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1312, in __or__
    for last, chunk in lookahead(self):
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/tools.py", line 35, in lookahead
    peek = next(it)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1151, in __next__
    self._chunks = iter(self._framehandler)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/units/__init__.py", line 1176, in _framehandler
    self._framed = Framed(
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 466, in __init__
    self.unpack = FrameUnpacker(stream)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 376, in __init__
    self._advance()
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 389, in _advance
    self._next = Chunk.unpack(stream)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 304, in unpack
    return cls(data, path, view=view, meta=meta, fill=fill)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/frame.py", line 200, in __init__
    m.update(meta)
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 289, in update
    self.fix()
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 292, in fix
    for key, value in self.items():
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 312, in items
    if not is_valid_variable_name(key):
  File "/home/username/.local/lib/python3.8/site-packages/refinery/lib/meta.py", line 177, in is_valid_variable_name
    if not name.isidentifier():
AttributeError: 'bytes' object has no attribute 'isidentifier'

Here is the list of installed Python modules in the virtual environment that are different from the global ones:

module                  venv          global
altgraph                0.17.2        0.17
backports.zoneinfo      0.2.1         n/a
cffi                    1.15.0        1.14.0
click                   8.0.3         n/a
colorama                0.4.4         0.4.3
cryptography            35.0.0        3.3.1
macholib                1.15.2        1.15
msgpack                 1.0.2         0.6.2
openpyxl                3.0.9         3.0.7
pefile                  2021.9.3      2019.4.18
Pillow                  8.4.0         8.2.0
pkg-resources           0.0.0         n/a
py7zr                   0.16.2        0.16.1
pycryptodome            3.11.0        3.10.1
pycryptodomex           3.11.0        3.10.1
pyppmd                  0.17.1        0.16.1
python-magic            0.4.24        0.4.18
pytz-deprecation-shim   0.1.0.post0   n/a
pyzstd                  0.15.0        0.14.4
setuptools              44.0.0        45.2.0
six                     1.16.0        1.15.0
texttable               1.6.4         1.6.2
toml                    0.10.2        0.10.1
tzdata                  2021.5        n/a
tzlocal                 4.0.1         2.1
wheel                   0.37.0        0.36.2
xdis                    5.0.4         5.0.11

The problem turned out to be msgpack. My outdated version 0.6.2 did this:

>>> import msgpack
>>> unpacker = msgpack.Unpacker()
>>> data = {"x": b"FOO"}
>>> unpacker.feed(msgpack.packb(data))
>>> item = next(unpacker)
>>> item
{b'x': b'FOO'}

After upgrading everything works. Maybe you could set a version requirement msgpack >= 1.0.0?
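
A version floor like the one suggested could also be checked at runtime. The following is a simplified sketch: version_tuple and meets_minimum are hypothetical helpers, and they only compare the leading numeric components, so they are no substitute for a real requirement specifier in the package metadata.

```python
import re


def version_tuple(version: str) -> tuple:
    """Extract the leading numeric components of a dotted version string."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = '1.0.0') -> bool:
    """Return True if installed is at least minimum (hypothetical helper)."""
    have = version_tuple(installed)
    need = version_tuple(minimum)
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# The two msgpack versions from the package list above.
global_ok = meets_minimum('0.6.2')
venv_ok = meets_minimum('1.0.2')
```

With the versions from this thread, global_ok is False and venv_ok is True, which lines up with where the exception did and did not occur.
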

huettenhain commented 2 years ago

Ah, awesome, thank you so much for already figuring this out! You're right, adding a version requirement for msgpack is a good idea.

huettenhain commented 2 years ago

Do you think that 1cc7141 is sufficient to close this out? I should again stress I appreciate the work you put into tracking down the issue.

baderj commented 2 years ago

Yes, thank you very much, that solves the problem. To test, I went back to the older, buggy version of msgpack and then reinstalled binary-refinery from 1cc714184d98ac4ad4ca95e366f27b03ec6b64de. As expected, that upgraded msgpack to the latest version, and everything works.