rspamd / rspamd

Rapid spam filtering system.
Other
2.02k stars 378 forks source link

[BUG] gpt.lua plugin fails at line 114 due to nil value passed to math.abs function #5065

Open Ne0os opened 1 month ago

Ne0os commented 1 month ago

Prerequisites

Describe the bug There is an error in the gpt.lua plugin where the math.abs function is called with a nil value, leading to a failure in the GPT_CHECK function. The issue occurs because the symbol weight retrieved from task:get_symbol(s) can be nil.

Steps to Reproduce

  1. Configure Rspamd to use the gpt.lua plugin.
  2. Trigger an email scan where a symbol specified in settings.symbols_to_except is present, but its weight is not set or is nil.
  3. Observe the error in the logs: bad argument #1 to 'abs' (number expected, got nil).

Expected behavior The script should handle cases where sym or sym.weight is nil and not attempt to call math.abs on a nil value.

Versions

Rspamd version: Rspamd daemon version 3.9.0 OS: Debian GNU/Linux 11 (bullseye), x86_64

Additional Information

Example code causing the issue:

-- We also exclude some symbols
for s, required_weight in pairs(settings.symbols_to_except) do
  if task:has_symbol(s) then
    if required_weight > 0 then
      -- Also check score
      local sym = task:get_symbol(s)
      -- Must exist as we checked it before with `has_symbol`
      if math.abs(sym.weight) >= required_weight then
        return false, 'skip as "' .. s .. '" is found (weight: ' .. sym.weight .. ')'
      end
      lua_util.debugm(N, task, 'symbol %s has weight %s, but required %s', s,
          sym.weight, required_weight)
    else
      return false, 'skip as "' .. s .. '" is found'
    end
  end
end

Proposed fix to handle nil values:

-- We also exclude some symbols
for s, required_weight in pairs(settings.symbols_to_except) do
  if task:has_symbol(s) then
    if required_weight > 0 then
      -- Also check score
      local sym = task:get_symbol(s)
      -- Check if sym is not nil and has a weight
      if sym then
        if sym.weight then
          if math.abs(sym.weight) >= required_weight then
            return false, 'skip as "' .. s .. '" is found (weight: ' .. sym.weight .. ')'
          end
          lua_util.debugm(N, task, 'symbol %s has weight %s, but required %s', s,
              sym.weight, required_weight)
        else
          -- Log if sym.weight is nil
          lua_util.debugm(N, task, 'symbol %s is found, but weight is nil', s)
        end
      else
        -- Log if sym is nil
        lua_util.debugm(N, task, 'symbol %s is found, but sym is nil', s)
      end
    else
      return false, 'skip as "' .. s .. '" is found'
    end
  end
end

This enhancement ensures that sym and sym.weight are checked for nil values before calling math.abs, preventing the error.

vstakhov commented 1 month ago

That's a good catch, thank you. However, it seems the logic is still the same with this bug: if we got a symbol from exclude_symbols we do not ask gpt in that case. We should definitely do it in a more graceful way though.

Ne0os commented 1 month ago

Thank you for the swift update and the changes to the function. The additional checks for sym and sym.weight are certainly helpful.

However, I noticed an issue with the sym.weight value as shown in the following log lines:

 bayes; bayes_classify: got ham probability -161.01 -> 0.00 and spam probability -26.97 -> 1.00, 68 tokens processed of 187 total tokens; 63 text tokens found of 169 text tokens
gpt; gpt.lua:116: symbol BAYES_SPAM has weight nil, but required 0.9

In this case, the BAYES_SPAM symbol shows a nil weight instead of the expected value of 0.99. This suggests that there may be an issue with how the weight is being retrieved or set.

The updated function handles nil values gracefully, but it seems that sym.weight occasionally does not reflect the correct value. This discrepancy could indicate a problem in the symbol’s data retrieval or processing.

I temporarily added a log line that logged sym to help identify the issue:

gpt; gpt.lua:110: Symbol details: {[1] = {[options] = {[1] = 99.99%}, [score] = 5.099999998407408, [groups] = {[1] = statistics}, [weight] = 0.9999999996877271, [group] = statistics}}

Changes Made:

local sym_details = sym[1]
if sym_details and sym_details.weight then
  if math.abs(sym_details.weight) >= required_weight then
    return false, 'skip as "' .. s .. '" is found (weight: ' .. sym_details.weight .. ')'
  end
end

I'm not a Lua expert, so I won't be making a pull request and I'm not entirely sure about my code.

Thank you again for your prompt response and efforts in addressing this issue. Your work is greatly appreciated!