JohnForecast / LinuxDECnet

DECnet as an external module for Linux
9 stars 0 forks source link

VMSmail hangs or garbles messages #2

Closed jayjwa closed 10 months ago

jayjwa commented 11 months ago

x86_64 AMD Ryzen 5, Linux Slackware (current), 6.5.5 (kernel.org) Sendmail 8.17.2

Network layout:

Node                     HWtype  HWaddress           Flags  MTU        Iface
ATR2                     loop    AA:00:04:00:01:04   ---    65533      lo
NERGIG                   ether   AA:00:04:00:0F:04   ---    1498       br0
LUNAST                   ether   AA:00:04:00:0E:04   ---    1498       br0
TEOSTR                   ether   AA:00:04:00:0D:04   ---    1498       br0
KUSHAL                   ether   AA:00:04:00:0C:04   ---    1498       br0
KIRIN                    ?       AA:00:04:00:0B:04   ---    0          0001498

OpenVMS 7.3 on [lunast]ra:

jayjwa@atr2 /u/s/L/d/mail> rsh -4 lunast show net

Product:  DECNET        Node:  LUNAST               Address(es):  1.14
Product:  TCP/IP        Node:  lunast.lan           Address(es):  192.168.20.3

Sending from VMS sometimes hangs, forcing ctrl-Y exit.

$ mail

MAIL> send
To: atr2::jayjwa
Subj:   Test of mail from VMS to Linux
Enter your message below. Press CTRL/Z when complete, or CTRL/C to quit:
As the subject states...
 Exit 
 Cancel 

 Interrupt

A connection shows but nothing in the maillog. Debug info shows dnetd sees the connection.

Active DECnet sockets (servers and established)
Proto  Local Address            Foreign Address          Dir State         Accept mode
 decnet *::PHONE                 *::*                     IN  LISTEN        IMMED
decnet ATR2::FAL                KUSHAL::SYSTEM               CLOSED        
decnet ATR2::MAIL               LUNAST::JAYJWA               ESTABLISHED   
decnet ATR2::FAL                KIRIN::SYSTEM                CLOSED        
decnet ATR2::FAL                KIRIN::SYSTEM                CLOSED

2023-10-29T20:51:10.699694-04:00 atr2 dnetd[21015]: Connection from: LUNAST::jayjwa
2023-10-29T20:51:10.703538-04:00 atr2 vmsmaild[7104]: got local user: JAYJWA

From Linux to VMS (.com added to hostname to defeat spambots), mail never arrives:

mailx -s "Test of mail from Linux to OpenVMS" 'lunast::jayjwa <vmsmail>'
To: "lunast::jayjwa" <vmsmail@atr2.ath.cx.com>
Subject: Test of mail from Linux to OpenVMS

Testing from Linux to VMS...
^D
-------
(Preliminary) Envelope contains:
To: "lunast::jayjwa" <vmsmail@atr2.ath.cx.com>
Subject: Test of mail from Linux to OpenVMS
Send this message [yes/no, empty: recompose]? yes

$ mail

MAIL> dir
%MAIL-E-NOTEXIST, folder MAIL does not exist

MAIL> 
LUNAST::JAYJWA 21:02:23 MAIL      CPU=00:00:01.14 PF=11124 IO=165 MEM=761
MAIL>

What Sendmail thinks about it:

2023-10-29T21:01:23.162164-04:00 atr2 sendmail[7614]: 39U11Nis007614: from=jayjwa, size=182, class=0, nrcpts=1, msgid=<202310300101.39U11Nis007614@atr2.ath.cx.com>, relay=jayjwa@localhost
2023-10-29T21:01:23.227555-04:00 atr2 sm-mta[7616]: NOQUEUE: connect from jayjwa@localhost [127.0.0.1]
2023-10-29T21:01:23.228467-04:00 atr2 sm-mta[7616]: 39U11Nhu007616: Milter (milter-regex): init success to negotiate
2023-10-29T21:01:23.228471-04:00 atr2 sm-mta[7616]: 39U11Nhu007616: Milter: connect to filters
2023-10-29T21:01:23.320282-04:00 atr2 sm-mta[7616]: tls_srv_features=(null), relay=localhost [127.0.0.1]
2023-10-29T21:01:23.320287-04:00 atr2 sm-mta[7616]: tls_srv_features=empty, stat=0, relay=localhost [127.0.0.1]
2023-10-29T21:01:23.339944-04:00 atr2 sendmail[7614]: STARTTLS=client, relay=[127.0.0.1], version=TLSv1.3, verify=OK, cipher=TLS_AES_256_GCM_SHA384, bits=256/256
2023-10-29T21:01:23.340925-04:00 atr2 sm-mta[7616]: STARTTLS=server, relay=jayjwa@localhost [127.0.0.1], version=TLSv1.3, verify=OK, cipher=TLS_AES_256_GCM_SHA384, bits=256/256
2023-10-29T21:01:23.427944-04:00 atr2 sm-mta[7616]: ruleset=trust_auth, arg1=jayjwa@atr2.ath.cx.com, relay=jayjwa@localhost [127.0.0.1], reject=550 5.7.1 <jayjwa@atr2.ath.cx.com>... not authenticated
2023-10-29T21:01:23.504374-04:00 atr2 sm-mta[7616]: 39U11Nhv007616: from=<jayjwa@atr2.ath.cx.com>, size=419, class=0, nrcpts=1, msgid=<202310300101.39U11Nis007614@atr2.ath.cx.com>, proto=ESMTPS, daemon=MTA, relay=jayjwa@localhost [127.0.0.1]
2023-10-29T21:01:23.504398-04:00 atr2 sm-mta[7616]: 39U11Nhv007616: Milter accept: message
2023-10-29T21:01:23.539542-04:00 atr2 sendmail[7614]: 39U11Nis007614: to=vmsmail@atr2.ath.cx.com, ctladdr=jayjwa (1000/1000), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30182, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (39U11Nhv007616 Message accepted for delivery)
2023-10-29T21:01:23.773353-04:00 atr2 sm-mta[7619]: 39U11Nhv007616: to=|/usr/local/sbin/sendvmsmail, ctladdr=<vmsmail@atr2.ath.cx.com> (197/12), delay=00:00:00, xdelay=00:00:00, mailer=prog, pri=30694, dsn=2.0.0, stat=Sent
2023-10-29T21:01:23.814437-04:00 atr2 sm-mta[7619]: 39U11Nhv007616: done; delay=00:00:00, ntries=1

I don't think the "not authenticated" bit matters. I can send mail from here to a Solaris host and from there back.

jayjwa@atr2 ~> mailx -s "Test of mail to Solaris" jayjwa@kulve.lan
To: jayjwa@kulve.lan
Subject: Test of mail to Solaris

To show that mail is working...
^D
-------
(Preliminary) Envelope contains:
To: jayjwa@kulve.lan
Subject: Test of mail to Solaris
Send this message [yes/no, empty: recompose]? yes

Trying 192.168.20.5...
Connected to kulve.lan.
Escape character is '^]'.

SunOS 5.9

login: jayjwa
Password: 
Last login: Sat Oct 28 21:10:26 from atr2:2
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
You have new mail.
[21:25 jayjwa@kulve:~ >] mailx                                       [pts/5 hst:1]
mailx version 5.0 Sat Apr  6 14:57:29 PST 2002  Type ? for help.
"/var/mail/jayjwa": 1 message 1 new
>N  1 jayjwa             Sun Oct 29 21:24   22/909   Test of mail to Solaris
? 

In other words, SMTP mail is working. Sometimes the DEC mail arrives with headers only but no body. Here is one with no body (also from OpenVMS 7.3 but another node):

jayjwa@atr2 ~> mailx
mailx version v14.9.24.  Type `?' for help
/var/spool/mail/jayjwa: 2 messages 1 new
 O  1 Mail System Intern 2023-10-29 19:17   13/536   DON'T DELETE THIS MESSAGE -
▸N  2 KIRIN::USER   2023-10-29 21:10   20/758   I hope this one gets thru..
? 2
[-- Message  2 -- 20 lines, 758 bytes --]:
Date: Sun, 29 Oct 2023 21:10:39 -0400
Message-Id: <202310300110.39U1AdIO008146@atr2.ath.cx.com>
From: vmsmail@atr2.ath.cx.com (KIRIN::USER )
Subject: I hope this one gets thru...
To: jayjwa@atr2.ath.cx.com
Cc: another.test.of.mail@atr2.ath.cx.com

--
?

Notice the subject landed in the CC, and the body landed in the Subject. The "forward" file is set up as per instructions.

cat /var/spool/vmsmail/.forward 
  |/usr/local/sbin/sendvmsmail
grep vmsmail /etc/passwd
  vmsmail:x:197:12:VMS/DECNET mail gateway user:/var/spool/vmsmail:/bin/ksh

There's no vmsmail.conf (not needed) and dnetd.conf is mostly default except for the Mirror object.

JohnForecast commented 11 months ago

Thanks for the problem report. Can you try the following to see if we can get some more informations on the problems:

  1. Add -ddd to the "MAIL" line in /etc/dnetd.conf: MAIL 27 N,N vmsmail vmsmaild -ddd

    This should generate some additional logging in /var/log/syslog or /var/log/messages

  2. Grab a tcpdump of a small transfer in each direction with something like: tcpdump -i br0 -w file.dmp decnet host aa.nn

    where aa.nn is the DECnet node address of your VMS system

I notice that you are using Slackware. Were there any changes necessary to the install script?

jayjwa commented 11 months ago

I didn't exactly use the install script, but rather cloned the repo, built the kmod, installed it, then the programs. Slackware doesn't have systemd. Since I already had a bridge (for Pydecnet) with tuntaps I used that. Pydecnet doesn't play nice with Linux DECNET (dnlogin and sethost won't work) so I shut it down. bridge.c (or the Italian modification with LAT/LAST/LAD) will work. If you already have a bridge, you don't have to change the lladdr of your external interface. Then you can start DECNET whenever you want, even after TCPIP, and it won't mess with IP6 addresses.

Today, with "MAIL 27 N,N vmsmail vmsmaild -ddd", I can get mail from VMS to Linux. Before I had "MAIL 27 N,Y vmsmail vmsmaild". So half the problem is solved I think?

O  3 LUNAST::SYSTEM     2023-10-31 12:20   20/710   Test from VMS to Linux        
▸N  4 KIRIN::SYSTEM      2023-10-31 13:09   19/719   Another test from VMS to Linux
? 4
[-- Message  4 -- 19 lines, 719 bytes --]:
Date: Tue, 31 Oct 2023 13:09:12 -0400
Message-Id: <202310311709.39VH9CR3022535@atr2.ath.cx.com>
From: vmsmail@atr2.ath.cx.com (KIRIN::SYSTEM )
Subject: Another test from VMS to Linux.
To: jayjwa@atr2.ath.cx.com
Cc: 

This is node KIRIN, hello. 

?

Linux to VMS still won't work. Packet capture is mail both directions. vmsmail-debug.pcap.gz

Sendmail's logs are the same as before, so I won't include them.

syslog:

2023-10-31T12:19:56.781466-04:00 atr2 sendvmsmail[19880]: running as 197
2023-10-31T12:19:56.781497-04:00 atr2 sendvmsmail[19880]: Sending message from jayjwa <jayjwa@atr2.ath.cx.com> to lunast::jayjwa
2023-10-31T12:19:56.781728-04:00 atr2 sendvmsmail[19880]: Local user: vmsmail
2023-10-31T12:19:56.980870-04:00 atr2 sendvmsmail[19880]: sendvmsmail: mail_error: jayjwa <jayjwa@atr2.ath.cx.com>; lunast::jayjwa; Test of mail: Linux to OpenVMS; Cannot connect to VMS system: Connection refused#012
2023-10-31T12:20:23.923114-04:00 atr2 dnetd[19520]: using user vmsmail from dnetd.conf
2023-10-31T12:20:42.171436-04:00 atr2 vmsmaild[19901]: Forwarding mail from LUNAST::SYSTEM       to jayjwa (n=1)
2023-10-31T12:20:42.259397-04:00 atr2 oidentd[19917]: Connection from localhost (127.0.0.1):34267
2023-10-31T12:20:42.259578-04:00 atr2 oidentd[19917]: [localhost] Successful lookup: 51470 , 25 : vmsmail (vmsmail)
2023-10-31T12:20:42.261401-04:00 atr2 milter-regex[1337]: localhost [127.0.0.1]: ACCEPT, HELO: atr2.ath.cx.com, FROM: , RCPT: , From: , To: , Subject:
2023-10-31T12:20:42.572855-04:00 atr2 vmsmaild[19901]: Forwarded mail? 0
2023-10-31T12:20:42.573142-04:00 atr2 dnetd[19520]: Reaped child process 19901
2023-10-31T12:24:40.587392-04:00 atr2 dnetd[19520]: using user vmsmail from dnetd.conf

syslog (debug level):

2023-10-31T12:20:23.923104-04:00 atr2 dnetd[19520]: Connection from: LUNAST::system
2023-10-31T12:20:23.940993-04:00 atr2 vmsmaild[19901]: got local user: JAYJWA
2023-10-31T12:20:42.171808-04:00 atr2 vmsmaild[19901]: body: You guessed it. 15
2023-10-31T12:20:42.172052-04:00 atr2 vmsmaild[19901]: body:  1

At this point, vmsmaild uses 100% (or near) CPU. Maybe hung?: 20101 vmsmail 20 0 3788 1984 1856 S 99.7 0.0 9:31.55 vmsmaild

An OPCOM message is on VMS side:

$
%%%%%%%%%%%  OPCOM  31-OCT-2023 11:57:36.00  %%%%%%%%%%%
Message from user AUDIT$SERVER on LUNAST
Security alarm (SECURITY) and security audit (SECURITY) on LUNAST, system id: 1038
Auditable event:          Network login failure
Event time:               31-OCT-2023 11:57:35.97
PID:                      00000236
Process name:             MAIL_8227
Username:                 DECNET
Remote nodename:          ATR2
Remote node id:           1025 (1.1)
Remote username:          Linux0000
Status:                   %LOGIN-F-NOSUCHUSER, no such user

What $ accounting thinks about it: 31-OCT-2023 11:57:36 LOGFAIL DECNET 00000236 ATR2 00D380F4

There certainly is a DECNET user. The setup is just the basic install with nothing fancy or changed.

DECNET DEFAULT       DECNET          [376,376]    DECNET   Normal  4 SYS$SPECIFIC:[DECNET]
                     DEFAULT         [200,200]             Normal  4 Disuser
FAL$SERVER DEFAULT   FAL$SERVER      [376,373]    DECNET   Normal  4 SYS$SPECIFIC:[FAL$SERVER]
FIELD SERVICE        FIELD             [1,10]     FIELD    All     4 Disuser
jayjwa               JAYJWA          [200,202]             Object  4 DUA1:[JAYJWA]
MAIL$SERVER DEFAULT  MAIL$SERVER     [376,374]    DECNET   Normal  4 SYS$SPECIFIC:[MAIL$SERVER]
NML$SERVER DEFAULT   NML$SERVER      [376,371]    DECNET   Normal  4 SYS$SPECIFIC:[NML$SERVER]
PHONE$SERVER DEFAULT PHONE$SERVER    [376,372]    DECNET   Normal  4 SYS$SPECIFIC:[PHONE$SERVER]
SYSTEM MANAGER       SYSTEM            [1,4]      SYSTEM   All     4 SYS$SYSROOT:[SYSMGR]

Here's a working one from another node (VMS to VMS). Note the name is MAIL$SERVER. Maybe that's important. 31-OCT-2023 12:07:30 PROCESS NETWORK MAIL$SERVER 00000237 KUSHAL 10000004

Thanks for taking this up. I've wanted to see Linux speaking DECNET for years.

JohnForecast commented 11 months ago

I haven't used Slackware since one of the earliest releases where it had to be installed from floppies - just curious.

I don't think adding "-ddd" fixed anything. All that does is increase the logging level to syslog!

The mail error from your syslog corresponds to the OPCOM message, I'll take a look at the packet trace to see what it is trying to do. Are these systems accessible via HECnet? The node names don't seem to match. Can you capture the output of "ncp sho object mail characteristics" on the VMS system?

I'm working on setting up a VMS system here so I can test locally,

jayjwa commented 11 months ago

Yes, I remember Slackware floppies. 2003 or so was when I got onboard. I detoured to LFS and then came back a few years ago. These nodes are not on HECnet because I don't want to change all the nodes and names (the one I did change at one time was a nightmare).

$ 
LUNAST::JAYJWA 17:39:21   (DCL)   CPU=00:00:01.07 PF=10478 IO=178 MEM=190
$ ncp show object mail characteristics

Object Volatile Characteristics as of 31-OCT-2023 18:05:57

Object = MAIL

Number                   = 27
File id                  = MAIL_SERVER.EXE
User id                  = MAIL$SERVER
Proxy access             = outgoing
Alias outgoing           = Enabled

As system, there's a Password field set to a random value, from install time. I never had to give any passwords to mail VMS to VMS, whether basic user or system.

JohnForecast commented 11 months ago

Can you grab a tcpdump trace between 2 VMS systems?

Unfortunately, DEC did not publish some of the protocol specifications; dts/dtr, passthru and, of course, mail-11. The VMS utilities for Linux were reverse engineered sometime around the mid 1990's. It's possible, but unlikely, that the protocols have changed.

jayjwa commented 11 months ago

dnlogin from 1.1, watching 1.14 mail to 1.11, with reply. vmsmail-debug-vms2vms.pcap.gz

JohnForecast commented 10 months ago

The latest version of sendvmsmail.c should solve the problem of logging into VMS using the default non-priviledged account. This version version also includes initial support for the latest mail-11 transport proctocol, mail11v3. Eventually this version will support "Cc:" lines. Use the "-3" switch to sendvmsmail to use mail11v3 by default with a fallback to the older protocol if the connection fails.

jayjwa commented 10 months ago

This fixes the problem. Add | "/usr/local/sbin/sendvmsmail -3" to the forward file.

    #1          29-NOV-2023 22:27:50.07                                  NEWMAIL
From:   ATR2::"jayjwa <jayjwa@atr2.ath.cx.com>"
To: lunast::jayjwa
CC: 
Subj:   Test of mail: Linux to OpenVMS

Test

MAIL>