Open cookyr opened 5 months ago
Weird, it works for me, although your regex doesn't actually do anything for that particular input string (i.e. I get yy as 2099-05-4-kaese, since the regex doesn't match and doesn't substitute anything, and so the title is set to test-2099-05-4-kaese-end).
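For reference, here is a minimal sketch of that non-match outside the postprocessor. It assumes Python's standard re module stands in for whatever regex_sub calls internally, and it uses the pattern exactly as it appears in the post above:

```python
import re

text = "2099-05-4-kaese"

# Pattern as posted: one arbitrary character, four digits, one arbitrary character.
posted_pattern = r".(\d{4})."

# The only run of four digits ("2099") sits at the very start of the string,
# so there is no character in front of it for the leading "." to consume.
match = re.search(posted_pattern, text)

# With no match, sub() hands the input back unchanged.
result = re.sub(posted_pattern, r"\1", text)

print(match)
print(result)
```

Since nothing matches, the substitution is a no-op and yy keeps the full input string.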
Could you set the verbose level to DEBUG and check the logs again? You can do this one of two ways: in your docker-compose.env file, add the line PNGX_POSTPROCESSOR_VERBOSE=DEBUG (e.g. right below the PAPERLESS_POST_CONSUME_SCRIPT=... line you added to hook the postprocessor), or use --verbose DEBUG.
There are going to be a lot of logs (it's very verbose), but the interesting line will probably start with Updating 'yy' using template {{ "2099-05-4-kaese" | regex_sub(".(\d{4}).","\1") }} and metadata...
Thanks for your answer!
There was a typo or copy/paste error in my pattern: two asterisks have disappeared.
It looks like this in my script:
yy: '{{ "2099-05-4-kaese" | regex_sub(".*(\d{4}).*","\1") }}'
and I expected yy to yield "2099".
To me this was weird as the regex_match in the match section worked fine.
I'll try the debugging switches next time.
Best regards Rüdiger
Aha! I tried it with your correct regex, and you're right, it matches. I think the issue is the \1 you're substituting: it's getting interpreted as a literal \1, i.e. a single character (just like \n is a single newline character, or \t is a single tab character).
If we just do this in the Python interpreter:
>>> import regex
>>> regex.sub(".*(\d{4}).*", "\1", "2099-05-4-kaese")
'\x01'
In other words, it's substituting with a literal \1 character, not the backslash-referenced first matching group, like it should.
The solution seems to be to write your substitution string as "\\1".
>>> regex.sub(".*(\d{4}).*", "\\1", "2099-05-4-kaese")
'2099'
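To make the difference between the two spellings easier to see, here is a small sketch that compares them side by side. It uses the standard re module rather than regex (an assumption: for this simple case I'd expect both to behave the same), and it adds a Python raw string as a third variant. Whether the Jinja template accepts a raw-string spelling is not checked here.

```python
import re

pattern = r".*(\d{4}).*"
text = "2099-05-4-kaese"

# "\1" is an octal escape: Python turns it into the single character chr(1)
# before the regex engine ever sees the replacement string.
wrong = re.sub(pattern, "\1", text)

# "\\1" is two characters, a backslash and "1", which the regex engine
# reads as a reference to the first capture group.
escaped = re.sub(pattern, "\\1", text)

# A raw string spells the same two characters without doubling the backslash.
raw = re.sub(pattern, r"\1", text)

print(len("\1"), len("\\1"))
print(repr(wrong), repr(escaped), repr(raw))
```

The length check is the quickest way to tell which of the two the interpreter actually received.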
If you can confirm that works, let me know and I'll close out this issue. 🙂
I want to use regex_sub to extract time information from filenames. But this function is not working for me, same for example.yml. I tried in the metadata_postprocessing section:
There's no error in the webserver's log, but the document arrives in paperless as 'test--end'. I also tried to use re.sub directly; as far as I understand Jinja, this should be feasible?
Any idea?
Best regards Rüdiger
Originally posted by @cookyr in https://github.com/jgillula/paperless-ngx-postprocessor/issues/3#issuecomment-2139432017