projectdiscovery / httpx

httpx is a fast and multi-purpose HTTP toolkit that allows running multiple probes using the retryablehttp library.
https://docs.projectdiscovery.io/tools/httpx
MIT License

DSL filter not working correctly in specific cases #1887

Open choket opened 3 weeks ago

choket commented 3 weeks ago

httpx version: v1.6.8

Current Behavior:

The DSL filter in the command httpx -silent -location -title -fdc 'starts_with(location, "https://") || starts_with(title, "Google")' <<< 'www.google.com' does not work. Specifically, the starts_with(title, "Google") part does not filter out pages whose title is "Google".

Expected Behavior:

httpx -silent -location -title -fdc 'starts_with(location, "https://") || starts_with(title, "Google")' <<< 'www.google.com' should return no output

Steps To Reproduce:

Running httpx -silent -location -title <<< 'www.google.com' tells us that www.google.com doesn't return a Location: header, and that its title is 'Google':

> httpx -silent -location -title  <<< 'www.google.com'
https://www.google.com [] [Google]

Next, running the same command with the DSL filter starts_with(title, "https://") || starts_with(title, "Google"), httpx correctly returns no output:

> httpx -silent -location -title -fdc 'starts_with(title, "https://") || starts_with(title, "Google")' <<< 'www.google.com'
>

However, when using the DSL filter starts_with(location, "https://") || starts_with(title, "Google"), httpx does not filter out the response with the title 'Google':

> httpx -silent -location -title -fdc 'starts_with(location, "https://") || starts_with(title, "Google")' <<< 'www.google.com'
https://www.google.com [] [Google]

Mzack9999 commented 3 weeks ago

This happens because the expression fails to evaluate as there is no value for location. DSL filters are very simple by nature, and location is a key that might appear in the response or not. If the filter is composite, you can obtain the same result via jq:

httpx -silent -json | jq 'select((has("location") and (.location | type == "string") and (.location | startswith("https://"))) or (has("title") and (.title | type == "string") and (.title | startswith("Google"))))'
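
To illustrate why a missing location can defeat the whole composite filter, here is a minimal Go sketch. It is not httpx's actual DSL engine: lookup, evalFilter and shouldFilter are hypothetical names, and the sketch assumes that both operands are resolved before || is applied and that an evaluation error is treated as "no match".

```go
package main

import (
	"fmt"
	"strings"
)

// lookup returns the string stored under key, or an error when the key
// is absent from the variable map (hypothetical helper).
func lookup(vars map[string]any, key string) (string, error) {
	v, ok := vars[key]
	if !ok {
		return "", fmt.Errorf("no value for %q", key)
	}
	s, ok := v.(string)
	if !ok {
		return "", fmt.Errorf("%q is not a string", key)
	}
	return s, nil
}

// evalFilter models the expression
//   starts_with(location, "https://") || starts_with(title, "Google")
// Assumption: every variable is resolved up front, so one missing
// variable fails the whole expression, not just its own operand.
func evalFilter(vars map[string]any) (bool, error) {
	location, err := lookup(vars, "location")
	if err != nil {
		return false, err
	}
	title, err := lookup(vars, "title")
	if err != nil {
		return false, err
	}
	return strings.HasPrefix(location, "https://") || strings.HasPrefix(title, "Google"), nil
}

// shouldFilter reports whether a result would be dropped.
// Assumption: a failed evaluation counts as "no match", so the result is kept.
func shouldFilter(vars map[string]any) bool {
	matched, err := evalFilter(vars)
	return err == nil && matched
}

func main() {
	withoutLocation := map[string]any{"title": "Google"}
	withEmptyLocation := map[string]any{"title": "Google", "location": ""}

	fmt.Println(shouldFilter(withoutLocation))   // false: evaluation fails, result is printed
	fmt.Println(shouldFilter(withEmptyLocation)) // true: title matches, result is filtered out
}
```

Under these assumptions, a map with an empty location behaves as the reporter expects, while a map with no location key at all does not.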

choket commented 3 weeks ago

Thanks for the answer but I don't really want to output to JSON and then filter that, I like httpx's coloured output.

I'm not too familiar with how the DSL internals work, but would adding an empty location to resultMap in https://github.com/projectdiscovery/httpx/blob/main/runner/types.go#L125 work as expected?

choket commented 3 weeks ago

After having a further look at the code, I can see that the result struct that's passed to evalDslExpr in https://github.com/projectdiscovery/httpx/blob/d58ad9d4c958d00394b6206a8074836117f20b56/runner/types.go#L124 does have an empty value for location as well as all other DSL variables.

However, when the result struct is converted to a map via resultToMap, the empty fields in result are omitted and not present in the resulting map. This is because omitempty is specified in almost all properties of the Result struct in https://github.com/projectdiscovery/httpx/blob/d58ad9d4c958d00394b6206a8074836117f20b56/runner/types.go#L34
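
The effect described above can be reproduced in isolation. The following Go sketch uses a hypothetical, trimmed-down result struct rather than httpx's real Result type, and it assumes the struct-to-map conversion honours JSON struct tags (modelled here as a JSON round trip). The names toMap and hasKey are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result is a hypothetical stand-in for httpx's Result struct,
// with omitempty on every field.
type result struct {
	URL      string `json:"url,omitempty"`
	Title    string `json:"title,omitempty"`
	Location string `json:"location,omitempty"`
}

// resultNoOmit is the same struct with omitempty removed from location.
type resultNoOmit struct {
	URL      string `json:"url,omitempty"`
	Title    string `json:"title,omitempty"`
	Location string `json:"location"`
}

// toMap converts a struct to a map by round-tripping through JSON.
// Assumption: this approximates what a tag-aware conversion does.
func toMap(v any) (map[string]any, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	m := make(map[string]any)
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, err
	}
	return m, nil
}

// hasKey reports whether key survives the conversion to a map.
func hasKey(v any, key string) bool {
	m, err := toMap(v)
	if err != nil {
		return false
	}
	_, ok := m[key]
	return ok
}

func main() {
	// Location is left at its zero value, as for www.google.com.
	fmt.Println(hasKey(result{URL: "https://www.google.com", Title: "Google"}, "location"))       // false
	fmt.Println(hasKey(resultNoOmit{URL: "https://www.google.com", Title: "Google"}, "location")) // true
}
```

In this sketch, dropping omitempty keeps the key in the map with an empty string value. It also changes the serialized JSON, since empty fields would then appear in the output.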

My question is, is there a reason why omitempty is specified? Can it be safely removed to make DSL filtering and matching work as expected?