Closed watery closed 10 months ago
Hi @watery,
So, what you want is achievable in a couple of ways. In Varnish Enterprise, there's vmod-xbody with .get_req_body(), which could get you that.
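On the open source side, you have a bunch of options:
* https://code.uplex.de/uplex-varnish/libvmod-re/ , it might be a bit cumbersome for what you want, but I'm sure you can make it work, and @nigoroll will happily help you, I'm sure
* https://github.com/gquintard/vmod_rers/blob/main/vmod.vcc#L33-L45 will also do what you ask, the only trick is that it's a `rust` vmod, so if you are using `varnish` 6.X, it won't work, but I'll happily help if you need it

But, in truth, you want none of that, you want vmod-jq, which will allow you to access those fields without mucking around with regexes :-)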
Hopefully that helps. I'm closing this as there are a bunch of options out there, so we probably won't invest time into building this feature, but we'll welcome a PR if one comes.
For your own idea of using a regular expression, the re vmod pretty much does exactly what you want.
But the way I read your question, you are actually receiving Content-Type: application/json requests, which you want to allow, and Content-Type: multipart/form-data requests, which you want to deny. So maybe you could just make the decision based on the header?
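A minimal sketch of that header-based decision, in case it helps. The backend address is a placeholder and the 403 response is only one assumed way to deny; neither comes from the question:

```vcl
vcl 4.1;

# Placeholder backend, present only so the sketch loads (assumption).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Decide on the header alone, without touching the request body.
    # (?i) makes the match case-insensitive; the prefix match tolerates
    # a trailing "; boundary=..." parameter.
    if (req.method == "POST" &&
        req.http.Content-Type ~ "(?i)^multipart/form-data") {
        return (synth(403));
    }
    # Everything else, including application/json, falls through
    # to the rest of vcl_recv and the built-in VCL.
}
```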
As @gquintard pointed out, using a regex on JSON is not a great idea, because regexes do not properly parse the structured data. For example, "customerId=(\d+)" would also match inside a JSON string, as here: {foo: "customerId=42"}. So the fail-safe recommendation is to parse JSON as such, but regexen are suitable for form-data.
Regarding JSON, the jq library used by vmod jq was way too slow for my purposes, so I wrote vmod frozen, which is much faster. I never ran a benchmark, but it never showed up as an issue with our clients, who process tens to hundreds of thousands of requests per second.
First of all, thank you both @gquintard and @nigoroll for your very quick replies! Next, I'm on OSS 7.4 - I've just edited my opening post - so anything in the Enterprise version is off for me.
Let me say that I only installed Varnish last weekend, so pardon any misuse of technical terms; I still have to explore all the available VMODs.
Let me add some clarifications.
> But the way I read your question, you are actually receiving Content-Type: application/json requests which you want to allow and Content-Type: multipart/form-data which you want to deny. So maybe you could just make the decision based on the header?
All the requests should be handled / allowed; none should be blocked or discarded. The different content types were just examples from the requests I'm interested in. In particular, the one that carries multipart/form-data is actually uploading a new image for the product, and that should ban / purge the cached copies of the requests that return product info.
> As @gquintard pointed out, using a regex on JSON is not a great idea, because regexes do not properly parse the structured data. For example, "customerId=(\d+)" would also match a json string as here: {foo: "customerId=42"}. So the fail-safe recommendation is to parse json as such, but regexen are suitable for form-data.
Totally agree. The idea of using a regular expression was just a simple elaboration on the `INT rematch_req_body(REGEX re)` function that's already in bodyaccess, to try to explain my requirement, but sure, if there are VMODs more targeted at handling JSON, it's best to use them.
The main point though, which maybe I didn't explain well, is that I need to process the client request body, i.e. the body of `req` in `vcl_recv()`. As I understand from the official docs, the body isn't available in that step, and I didn't see where / how I can read it. I admit I just skimmed through the VMOD pages you linked, so maybe I overlooked that.
I found BOOL cache_req_body(BYTES size) but then?
> On the open source side, you have a bunch of options:
>
> * https://code.uplex.de/uplex-varnish/libvmod-re/ , it might be a bit cumbersome for what you want, but I'm sure you can make it work, and @nigoroll will happily help you, I'm sure
> * https://github.com/gquintard/vmod_rers/blob/main/vmod.vcc#L33-L45 will also do what you ask, the only trick is that it's a `rust` vmod, so if you are using `varnish` 6.X, it won't work, but I'll happily help if you need it
>
> But, in truth, you want none of that, you want vmod-jq, which will allow you to access those fields without mucking around with regexes :-)
Oh, rereading more closely: the first two in fact run regexes on the request body. So... were you suggesting I use one of them to extract the whole body and then pass it to vmod-jq?
@watery, I would go with vmod-jq personally, only because I'm familiar with jq itself and because the API is dead simple (as for the others, you can parse the request body directly), but if you need more performance (do test, do measure), then graduate to faster solutions.
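On the "I found `cache_req_body`, but then?" question: the usual pattern is to buffer the body first with `std.cache_req_body()` and only then call a body-aware VMOD from `vcl_recv`. Below is a minimal sketch using the `rematch_req_body()` function bodyaccess already ships. The 10KB limit, the backend address and the `X-Has-Customer-Id` header name are assumptions for illustration only, and this only tests for a match; it does not extract the value.

```vcl
vcl 4.1;

import std;
import bodyaccess;

# Placeholder backend, present only so the sketch loads (assumption).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.method == "POST") {
        # Buffer the request body so VMODs can read it in vcl_recv.
        # Returns false if the body exceeds the limit (10KB is an
        # arbitrary choice here) or cannot be read.
        if (std.cache_req_body(10KB)) {
            # rematch_req_body() returns 1 on match, 0 on no match,
            # -1 on error.
            if (bodyaccess.rematch_req_body("customerId=[0-9]+") == 1) {
                # Hypothetical marker header, just to show where the
                # result can be acted upon.
                set req.http.X-Has-Customer-Id = "true";
            }
        }
    }
}
```

The other VMODs mentioned in this thread are used from the same place: once the body has been buffered, call them instead of `bodyaccess` inside the `std.cache_req_body()` branch.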
I'm new to Varnish, installed 7.4 OSS, and need to cache some POST requests. I found this repo which has a bodyaccess VMOD, but if I understand it correctly, the vmod only allows checking whether some text (regexp) is present in the body, but not to extract it.
I have to cache requests like this, which return info about product 11011:
Those requests should be banned when a request like this one is received:
How can I address such cases? Can I request to extend bodyaccess to add a function to retrieve a portion of the body, like:
STRING reextract_req_body(REGEX re, STRING pattern)
reextract_req_body("customerId=(\d+)", "\1")