trusteddomainproject / OpenDKIM

Verification of missing KeyTable entries gives incomprehensible error #164

Open bernd-wechner opened 1 year ago

bernd-wechner commented 1 year ago

This repo was last touched (for a small typo fix) 5 years ago. So sadly, this looks like abandonware.

But I'm struggling to get it to work and found one person sharing my gripe here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=895308

merely 4 years old ;-).

But I have pilfered the title, and dropped the issue here.

Summary

$ /usr/sbin/opendkim -vv
opendkim: using default configfile /etc/opendkim.conf
opendkim: /etc/opendkim.conf: could not find valid key record "1" in KeyTable

This message is useless in the extreme. What is it looking up, what did it find, and what was it expecting?

thegushi commented 1 year ago

Okay, in the bug you're referencing, the person was using a SQL database. Are you? Can you paste your config file?

The person in the debian bug was asking for a rewording of the error (which is easy) but I cannot tell if you're having the same error.

bernd-wechner commented 1 year ago

the person was using a SQL database. Are you?

Yes indeed. And I have finally overcome all the issues I had, which didn't end at this one. I can share the config later (I have no access to the server just now, but it uses a pgsql lookup). I found the cause of the message: the table had not been fully populated and lacked a path to the key file.

It could save such folk a lot of time if the error were simply clear, referred to the relevant config setting, and said what the lookup was expecting and didn't find. In retrospect this is just part of a learning curve. I understand much better now than I did then what's going on, and it all works (albeit I wouldn't call myself a pro yet, by any measure). The point being, I guess, that clear and instructive error messages really shorten those learning curves: they make diagnosis and correction faster and easier well before a support request is issued, and they make providing support easier if the message is shared.

As an aside, I had a similar messy time with the permissions and ownership of the key files. Quite frustrating because of the rather poor clarity of the errors. But I should split that to another related issue as I can summarise the experience well (and don't fully understand it yet alas, though have a working install now - which is nice).

thegushi commented 1 year ago

If you can tell me what you feel the errors should say (or at least, what error you're getting) I can try to improve it, or perhaps make it so the code spits out the filesystem error you're getting. (I.e. if it's not saying something like "error 13" we can at least try to do that.)

Get me logs and suggestions, and I'll see what I can do.

bernd-wechner commented 1 year ago

Thanks kindly for the offer to improve these messages.

I'm familiar with three errors that I found unhelpful. The first read:

could not find valid key record "1" in KeyTable

And arose because the KeyTable was configured with dsn:pgsql: with the relevant elements being table=dkim?keycol=id?datacol=domain_name,selector,private_key_path.

Useful would be something like:

The KeyTable (setting|property|configuration|) is configured to read from the table "dkim" in the database "modoboa". We tried to find an entry with an "id" of 1, and could not find one.

I'd go one further, but can't suggest something because I don't know the answer to "Why is it that we're looking for id 1 to begin with?" This is a mystery I have yet to resolve: how opendkim knows to look for a KeyTable entry with a keycol of 1 in the first place. This becomes relevant in the second message that confused me.

The second puzzling error I got was after I created a domain in that table (I'm using modoboa and added a domain, which in effect added a row to that table). By default, though, the row had values for the keycol (id), "domain_name" and "selector" but was blank in private_key_path. This now produced two errors from opendkim:

KeyTable entry for '1' corrupt
D2F121520732: error loading key '1'

It produced these when testing with swaks, with postfix configured to use opendkim as a milter via:

smtpd_milters = inet:127.0.0.1:12345
non_smtpd_milters = inet:127.0.0.1:12345

So it was while testing SMTP with swaks that I was seeing these errors and scratching my head.

In this case, a better and time saving piece of feedback would have been:

The KeyTable (setting|property|configuration|) is configured to read from the table "dkim" in the database "modoboa". We tried to find an entry with an "id" of 1, and found one, but the value of the third expected field returned (private_key_path) did not provide a valid path to a private key (.pem file) that opendkim needs.
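To make the two proposed KeyTable messages concrete, here is a minimal Python sketch of the wording logic. It is not OpenDKIM code (which is written in C), and the helper name `explain_keytable_lookup` is hypothetical. The database, table and column names come from the config above; the domain and selector values in the example row are made up.

```python
from typing import Optional, Sequence


def explain_keytable_lookup(
    database: str,
    table: str,
    keycol: str,
    key: str,
    datacols: Sequence[str],
    row: Optional[Sequence[Optional[str]]],
) -> Optional[str]:
    """Return None when the row looks usable, otherwise a descriptive message.

    Hypothetical helper for illustration only; not part of OpenDKIM.
    """
    where = (
        f'The KeyTable setting is configured to read from the table "{table}" '
        f'in the database "{database}".'
    )
    looked_for = f'We tried to find an entry with "{keycol}" equal to {key}'

    # Case 1: the lookup returned nothing at all.
    if row is None:
        return f"{where} {looked_for}, and could not find one."

    # Case 2: the row does not have the number of fields datacol asks for.
    if len(row) != len(datacols):
        return (
            f"{where} {looked_for}, and found one, but it returned "
            f"{len(row)} field(s) where {len(datacols)} were expected "
            f"({', '.join(datacols)})."
        )

    # Case 3: a field is present but blank, e.g. an empty private_key_path.
    for position, (name, value) in enumerate(zip(datacols, row), start=1):
        if value is None or not str(value).strip():
            return (
                f"{where} {looked_for}, and found one, but field "
                f"{position} ({name}) is empty."
            )
    return None


DATACOLS = ["domain_name", "selector", "private_key_path"]

# No row at all: the situation behind the first error.
print(explain_keytable_lookup("modoboa", "dkim", "id", "1", DATACOLS, None))

# Row present but private_key_path blank: the situation behind the second.
# "mydomain.tld" and "mail" are made-up example values.
print(explain_keytable_lookup(
    "modoboa", "dkim", "id", "1", DATACOLS, ["mydomain.tld", "mail", ""]
))
```

The point is only that the message names the setting, the table, the key that was looked up, and the specific field that failed, so the reader knows where to look.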

The third puzzling error message I encountered puzzled me on two levels, but first the message I got:

2: can't load key from /etc/mail/DKIMkeys/mydomain2.tld.pem: Permission denied

The first mystery this message posed was that /etc/mail/DKIMkeys contained two .pem files as I now had two domains defined in modoboa, and the two .pem files had exactly the same permissions, but somehow the error only arose for the second one, not the first. That perplexed me. But it might relate to the domains I was testing with swaks and the order of these tests and order in which the permission and ownerships were set.

In this space I was deviating (hand on heart) from the documented standards in a sense, in that I elected, rightly or wrongly, to store the keys in /etc/mail (only because of existing precedents on that system, with ssl keys stored under the webserver config in /etc and other system related keys). But right or wrong, the puzzle I faced was finding ownerships and permissions that worked. It took some while, and involved navigating other cryptic errors like:

key data is not secure: opendkim is in group 8 which has multiple users (e.g., "mail").

In short, this and the previous message could be much improved by stating the requirements clearly (or the rules broken), not least because the second suggests a bug in opendkim. To explain that conclusion:

I had the keys set up with "640 root:mail" permissions, and I did have two users in "mail". So I went for something slightly more orthodox and tried "640 root:opendkim", but got the selfsame message (a reference to group 8, which is called "mail", and not to the group "opendkim", which has only one user in it).

I resolved this by going by the book with "600 opendkim:opendkim" permissions, and all came good. The time it cost me was not small, alas (also not gargantuan, just bothersome). It would have cost far less if the errors had read something more like:

The key data file (/etc/mail/DKIMkeys/mydomain2.tld.pem) cannot be read (by user opendkim or group opendkim).

and

The key data file (/etc/mail/DKIMkeys/mydomain2.tld.pem) is not secure. It can be read by 2 users in the group "mail", and while the RequireSafeKeys setting is true (the default value), only one is permitted.

But this last one also suggested a bug or puzzling feature. I tried setting RequireSafeKeys to false, and the error did not disappear from the logs. In systemd's status check the error remained present, though interestingly it was rendered orange when RequireSafeKeys was not mentioned in the config (hence true by default), and rendered yellow after I added the setting, set it to false, and restarted opendkim. I had hoped this setting would make the message go away.
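As an illustration of the two permission messages proposed above, here is a rough Python sketch. It is deliberately simplified and is not how opendkim actually decides whether a key is safe; for one thing it looks only at the file itself, and I don't know exactly what else opendkim inspects. The function name is hypothetical, and ownership and group membership are passed in as plain values (in practice they would come from something like `os.stat`, `pwd` and `grp`). The second group member, "postfix", is a made-up example.

```python
import stat
from typing import List, Sequence


def explain_key_file_problems(
    path: str,
    mode: int,
    owner: str,
    group: str,
    group_members: Sequence[str],
    run_as: str,
) -> List[str]:
    """Describe why a key file is unreadable or unsafe for user `run_as`.

    Simplified illustration; not OpenDKIM's real safety check.
    """
    perms = stat.S_IMODE(mode)
    problems = []

    # Unix picks exactly one permission class: owner, else group, else other.
    if owner == run_as:
        can_read = bool(perms & stat.S_IRUSR)
    elif run_as in group_members:
        can_read = bool(perms & stat.S_IRGRP)
    else:
        can_read = bool(perms & stat.S_IROTH)

    if not can_read:
        problems.append(
            f'The key data file ({path}) cannot be read by user "{run_as}" '
            f'(owner "{owner}", group "{group}", mode {perms:03o}).'
        )

    if perms & stat.S_IROTH:
        problems.append(
            f"The key data file ({path}) is not secure. "
            f"Mode {perms:03o} lets every user read it."
        )

    # Group-readable files are exposed to every other member of the group.
    others = sorted(set(group_members) - {run_as})
    if perms & stat.S_IRGRP and others:
        problems.append(
            f'The key data file ({path}) is not secure. It can be read by '
            f'{len(others)} other user(s) in the group "{group}" '
            f'({", ".join(others)}).'
        )
    return problems


# The "640 root:mail" attempt, with a made-up second member of "mail".
for line in explain_key_file_problems(
    "/etc/mail/DKIMkeys/mydomain2.tld.pem",
    0o640, "root", "mail", ["opendkim", "postfix"], "opendkim",
):
    print(line)
```

Each message names the file, the mode, and the specific users or group that break the rule, which is the information I was missing when reading "group 8 which has multiple users".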

Anyhow, I do understand that my proposals for messages are rather rich. This comes, I admit, from a history of both writing such code (and error messages) and doing QA on software with such messages. I know the work involved in assembling informative errors (and hence fully understand, and myself indulge in, shortcutting that), but also the immense value of them.

All I can offer is a noob's encounter and the cost that cryptic messages posed, a cost that could be saved with lucid, descriptive messages. The general formula for lucid errors (not meaning to preach, it's just one I've encountered, adopted and evolved over time) is essentially to paraphrase an assertion that you might write to preempt the error: describe in natural language what was expected and what was (or was not) found to hold true, always to the level of precision and detail that empowers a reader to fix the broken assertion.
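That formula can be sketched in a few lines of Python. This is only an illustration of the pattern, not a proposal for opendkim's C code; the names `ConfigError` and `expect` are hypothetical.

```python
class ConfigError(Exception):
    """Raised when a configuration expectation does not hold."""


def expect(condition: bool, expected: str, found: str, fix: str) -> None:
    """Assert `condition`; on failure say what was expected, found, and how to fix it."""
    if not condition:
        raise ConfigError(f"Expected {expected}, but found {found}. {fix}")


private_key_path = ""  # the blank field described earlier in this comment

try:
    expect(
        bool(private_key_path.strip()),
        expected="a private key path in the third KeyTable data field",
        found=repr(private_key_path),
        fix="Set private_key_path for this row to the key's .pem file.",
    )
except ConfigError as err:
    print(err)
```

Forcing every check to supply an "expected", a "found" and a "fix" is what keeps the message from collapsing into something like "entry corrupt".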

almereyda commented 2 months ago

I've seen the same with Modoboa today, but the field with the path to the file was just empty in the affected entry.