Open rvalles opened 1 year ago
Hello @rvalles,
Thanks for the bug report. I don't have a PC with a working NIC at the moment, so I can't actually test this. There had been significant testing towards the end of last year, when the NE2K was last worked on.
The fact that the detected MAC address is wrong seems to be a big red flag, since this is read early in the NE2K driver startup and hasn't changed in some time. What is the correct MAC address? I'm wondering if any portions of the shown MAC coincide, giving me something to go on as to why it's incorrect. Perhaps there is an issue with 8- or 16-bit bus addressing between your NIC and the driver.
My guess is that your NE2K NIC and the driver aren't working together at all, and when ktcp starts up and sends its first ARP request, the system may be hanging in the NIC driver. To check whether the system has in fact crashed, or whether just your shell/ktcp is hung, try logging in on the console. Are you running the v0.7.0 image from the recent release, or rebuilding the current development master?
Thank you!
Are you running the v0.7.0 image from the recent release,
Recent release. I may look into building it.
What is the correct MAC address?
vs
Note how 00:80:c8:--:--:-- becomes 00:--:80:--:c8:--.
Note how 00:80:c8:--:--:-- becomes 00:--:80:--:c8:--.
That's exactly what I was looking for, thank you! So what could be happening is that the driver is using word reads instead of byte reads to get the data, and 35 is being read in the lower 8 bits, as indicated in the MAC address obtained. Let me look a bit into the source and see what might be the culprit here. If you'd like to look, the code I'll be inspecting is elks/arch/i86/drivers/net/ne2k.c::ne2k_drv_init(). I am noticing the following at the top of that function:
void ne2k_drv_init(void)
{
    int err, i;
    word_t prom[16];    /* PROM containing HW MAC address and more
                         * (aka SAPROM, Station Address PROM).
                         * PROM size is 16 bytes. If read in word (16 bit) mode,
                         * the upper byte is either a copy of the lower or 00.
                         * depending on the chip. This may be used to detect 8 vs 16
                         * bit cards automagically.
                         * The address occupies the first 6 bytes, followed by a 'card signature'.
                         * If bytes 14 & 15 == 0x57, this is a ne2k clone.
                         */
It appears there may be an issue with reading from your NIC using 16-bit vs 8-bit moves, depending on the chip. What is your exact NE2k card and do you have a chip number? We might need that. We could compare with the packet driver source as well, but that's probably a lot of work.
The function goes on to read the MAC address:
ne2k_get_hw_addr(prom);
Which then goes into an ASM routine that reads the NIC differently based on whether it thinks the card is 8- or 16-bit. Did you try forcing the driver to run in both these modes? I'm not that familiar with the exact ne0= flags line from /bootopts but the documentation on the Wiki should explain how to force both. That might make a difference, since the driver isn't detecting the card correctly, as it's indicating a NE1K (which is an 8-bit NIC, I believe).
Built master with a few changes:
----------------------- elks/arch/i86/drivers/net/ne2k.c -----------------------
index d3296bd3..3839b083 100644
@@ -468,8 +468,8 @@ void ne2k_drv_init(void)
ne2k_get_hw_addr(prom);
err = j = k = 0;
- //for (i = 0; i < 32; i++) printk("%02x", cprom[i]);
- //printk("\n");
+ for (i = 0; i < 32; i++) printk("%02x", cprom[i]);
+ printk("\n");
/* if the high byte of every word is 0, this is a 16 bit card
* if the high byte = low byte in every word, this is probably QEMU */
@@ -478,7 +478,7 @@ void ne2k_drv_init(void)
/* ne2k_flags may be used as a simple variable until
* we add in the buffer flags below */
- if (j && (j!=k)) {
+ if (j && (j!=k) && !(net_flags&ETHF_16BIT_BUS)) {
ne2k_flags = ETHF_8BIT_BUS;
model_name[2] = '1';
netif_stat.if_status |= NETIF_AUTO_8BIT;
@@ -496,6 +496,9 @@ void ne2k_drv_init(void)
i = 1;
while (i < 6) printk(":%02x", mac_addr[i++]);
+ if (net_flags&ETHF_16BIT_BUS) {
+ printk(" (16bit)"); // Forced 16bit
+ }
if (net_flags&ETHF_8BIT_BUS) {
/* flag that we're forcing 8 bit bus on 16 NIC */
if (!ne2k_flags) printk(" (8bit)");
Boots like this:
Direct console, scan kbd 80x25 emulating ANSI (3 virtual consoles)
ttyS0 at 0x3f8, irq 4 is a 16450
ttyS1 at 0x2f8, irq 3 is a 16450
64 ext buffers, 65536 ram
eth: ne0 at 0x300, irq 1100358035c8352235df22696944002d004c28695a6e006b06208420a8570d5713
, (ne2k) MAC 00:80:c8:22:df:69 (16bit) (4k buffer), flags 0xa4
bioshd: hda BIOS CHS 1024,255,63
bioshd: hda IDE CHS 16383,16,63
/dev/hda: 16383 cylinders, 16 heads, 63 sectors = 8063.5 Mb
/dev/fd0: 80 cylinders, 2 heads, 18 sectors = 1440.0 kb
Partitions: hda:(0,16514064) hda1:(63,8401932)
device_setup: BIOS drive 0x0, root device 0x380
PC/AT class machine, syscaps 0xff, 638K base ram.
ELKS kernel 0.7.0 (55776 text, 12128 ftext, 8048 data, 42336 bss, 15150 heap)
Kernel text 2d0:0, ftext 106e:0, init 1230:0, data 1364:0, top 9f80:0, 501K free
fd: /dev/fd0 ELKS bootable, has 80 cylinders, 2 heads, and 18 sectors
MINIX-fs: mounting unchecked file system 0x380, running fsck is recommended.
VFS: Mounted root 0x0380 (minix filesystem).
Running /etc/rc.sys script
Sat Aug 12 11:47:50 2023
ELKS 0.7.0
login: root
elks86# net show
ip 192.168.1.186 gateway 192.168.1.1 mask 255.255.255.0 ne0
elks86# cat /etc/net.cfg
# ELKS Networking Configuration File
# sourced by /bin/net for ktcp and daemons
# Default IP address, gate and network mask.
# These can be IP addresses or names in /etc/hosts.
if test "$LOCALIP" != ""; then localip=$LOCALIP; else localip=10.0.2.15; fi
gateway=192.168.1.1
netmask=255.255.255.0
# reduce for 8bit NE2K w/4K buffer
mtu=
#mtu="-m 1000"
# default link layer [ne0|wd0|3c0|slip|cslip]
link=ne0
# default serial port and baud rate if slip/cslip
device=/dev/ttyS0
baud=38400
# to use ftp/ftpd in qemu
#export QEMU=1
# daemons to start are actually shell variable command lines see below
netstart="telnetd ftpd"
#netstart="ftpd"
# specific daemon command lines, named in netstart=
telnetd="telnetd"
ftpd="ftpd -d"
httpd="httpd"
# custom code executed before network startup
custom_prestart_network()
{
}
# custom code executed after network startup
custom_poststart_network()
{
#echo "Network started"
}
# custom code executed after network shutdown
custom_stop_network()
{
}
elks86# net start
Starting networking on ne0
ktcp -b -p ne0 192.168.1.186 192.168.1.1 255.255.255.0
It freezes all the same.
Tried 0xA0 too (leaving buffer size to the default). No dice.
But notice how the MAC is now correct.
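The PROM dump printed in the boot log above shows why the two bus modes disagree about the MAC. Below is a minimal sketch, not the driver's actual code: the function names and the `de200_prom_head` array are made up for illustration, and the bytes are simply the first twelve from the dump. It reads the same buffer once as consecutive bytes and once as little-endian words, keeping only the low byte of each word.

```c
#include <stdint.h>
#include <stdio.h>

#define MAC_LEN 6

/* First 12 bytes of the PROM dump from the boot log (problem card). */
const uint8_t de200_prom_head[12] = {
    0x00, 0x35, 0x80, 0x35, 0xc8, 0x35,
    0x22, 0x35, 0xdf, 0x22, 0x69, 0x69
};

/* 8-bit interpretation: the MAC is the first six consecutive bytes. */
void mac_from_byte_reads(const uint8_t *raw, uint8_t mac[MAC_LEN])
{
    for (int i = 0; i < MAC_LEN; i++)
        mac[i] = raw[i];
}

/* 16-bit interpretation: each MAC byte is the low byte of one
 * little-endian word, so every second byte of the buffer is skipped. */
void mac_from_word_reads(const uint8_t *raw, uint8_t mac[MAC_LEN])
{
    for (int i = 0; i < MAC_LEN; i++)
        mac[i] = raw[2 * i];
}

/* Format as aa:bb:cc:dd:ee:ff; out must hold at least 18 bytes. */
void mac_format(const uint8_t mac[MAC_LEN], char out[18])
{
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

With the twelve bytes above, the byte interpretation gives 00:35:80:35:c8:35 (the address in the original report) and the word interpretation gives 00:80:c8:22:df:69 (the address printed once 16-bit mode was forced).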
This is the card, a D-Link DE-200TP+.
non-plus variant is documented here:
https://stason.org/TULARC/pc/network-cards/D/D-LINK-Ethernet-DE-200TP-for-PC-AT.html
According to this, IRQ is 11.
I/O is supposed to be 0x300 according to freedos, which would match this other similar card's I/O bank:
https://stason.org/TULARC/pc/network-cards/D/D-LINK-Ethernet-DE-205-TP.html
Boot rom (xtide universal bios) is at C8000H and enabled.
I added a call ne2k_reset before call ne2k_base_init in ne2k_get_hw_addr() within ne2k-asm.S.
This is according to: https://wiki.osdev.org/Ne2000
A reset must be done before reading the prom.
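For reference, here is a rough C sketch of the reset sequence as I understand it from that OSDev page: read the reset port and write the value back, poll the interrupt status register until the reset bit is set, then acknowledge. This is not the ELKS ne2k_reset routine (that one lives in ne2k-asm.S). The `port_io` abstraction, the `fake_nic` device and the poll limit are all hypothetical, added so the sketch can run in user space and so a card that never answers cannot hang the caller.

```c
#include <stdint.h>

#define NE_ISR    0x07   /* interrupt status register, offset from I/O base */
#define NE_RESET  0x1F   /* reset port, offset from I/O base */
#define ISR_RST   0x80   /* "reset complete" bit in ISR */

/* Port I/O is abstracted so the sketch can run against a fake device. */
struct port_io {
    uint8_t (*in8)(void *ctx, uint16_t port);
    void (*out8)(void *ctx, uint16_t port, uint8_t val);
    void *ctx;
};

/* Returns 0 on success, -1 if reset-complete never showed up
 * within max_polls reads of the ISR. */
int ne_reset_bounded(const struct port_io *io, uint16_t base, unsigned max_polls)
{
    uint16_t reset_port = (uint16_t)(base + NE_RESET);
    uint16_t isr_port = (uint16_t)(base + NE_ISR);

    /* Reading the reset port and writing the value back triggers the reset. */
    io->out8(io->ctx, reset_port, io->in8(io->ctx, reset_port));

    while (max_polls--) {
        if (io->in8(io->ctx, isr_port) & ISR_RST) {
            io->out8(io->ctx, isr_port, 0xFF);  /* acknowledge all ISR bits */
            return 0;
        }
    }
    return -1;
}

/* Hypothetical fake NIC: reports reset-complete after a number of ISR polls. */
struct fake_nic {
    unsigned polls_until_ready;
    int reset_written;
};

uint8_t fake_in8(void *ctx, uint16_t port)
{
    struct fake_nic *nic = ctx;
    if ((port & 0x1F) == NE_ISR && nic->reset_written) {
        if (nic->polls_until_ready == 0)
            return ISR_RST;
        nic->polls_until_ready--;
    }
    return 0;
}

void fake_out8(void *ctx, uint16_t port, uint8_t val)
{
    struct fake_nic *nic = ctx;
    (void)val;
    if ((port & 0x1F) == NE_RESET)
        nic->reset_written = 1;
}
```

A bounded poll like this returns an error instead of spinning forever, which would at least show whether a freeze happens in the reset wait or somewhere after it.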
Unfortunately, the build process takes hours, and build.sh did a clean, so it'll be a while till I can see whether that helped.
Hello @rvalles,
Thanks for your information, I'll take a look deeper at it.
Unfortunately, the build process takes hours, and build.sh did a clean, so it'll be a while till I can see whether that helped.
Geez, is running ./build.sh a second time rebuilding the entire toolchain?! That will take forever!
There is a much better way, once the toolchain has been built: you can just use make to rebuild ELKS, which takes a couple of minutes:
Rebuild ELKS (not toolchain) - takes 2-3 minutes:
make clean
make
Rebuild just the kernel (can be used by you for NE2K mods) - takes 30-40 seconds:
make kclean
make
A reset must be done before reading the prom.
Thank you for trying to debug the problem with your NE2K card!
Geez, is running ./build.sh a second time rebuilding the entire toolchain?! That will take forever!
The script seems clever enough to skip the toolchain build.
But the rest takes a very long time, for reasons unknown.
I will of course not be using the script again if I need to rebuild thereon.
Build finished just now. Over 2h.
Build finished just now. Over 2h.
It sounds like it's rebuilding the toolchain. If not, and you can somehow capture the log and post it, I can see what is going on. In the meantime, just use make kclean; make from here on out and we'll try to solve your NE2K quickly! :)
And I messed that up. Doing a time make kclean &>somelog now. It's extremely slow and I have no idea why. Zen+, 16GB RAM, otherwise builds far larger software much faster.
On my system (MacBook Pro 16G RAM):
time make kclean 3.5s
time make 24.5s
As far as I can tell, it's these dokcleans. Each takes a very long time, with low cpu usage.
rvalles 1954 0.0 0.0 10508 5556 pts/7 Ss Aug12 0:00 \_ -bash
rvalles 1065583 0.0 0.0 12716 3848 pts/7 S+ 02:19 0:00 | \_ make kclean
rvalles 1065584 0.2 0.0 12848 4168 pts/7 S+ 02:19 0:00 | \_ make -C elks kclean
rvalles 1075010 0.0 0.0 9108 3892 pts/7 S+ 02:20 0:00 | \_ /bin/sh -c for DIR in */ ; do \ ?if [ -f "$DIR/Makefile" -a "$DIR" != "tools" ]; then \ ??make -C "$DIR" dokclean ; \ ?fi ; \ done
rvalles 1075011 0.2 0.0 12804 4036 pts/7 S+ 02:20 0:01 | \_ make -C arch/ dokclean
rvalles 1086168 0.0 0.0 9108 3764 pts/7 S+ 02:21 0:00 | \_ /bin/sh -c for DIR in */ ; do \ ?if [ -f "$DIR/Makefile" -a "$DIR" != "tools" ]; then \ ??make -C "$DIR" dokclean ; \ ?fi ; \ done
rvalles 1086169 0.2 0.0 12924 3908 pts/7 S+ 02:21 0:01 | \_ make -C i86/ dokclean
rvalles 1097316 0.0 0.0 9112 3892 pts/7 S+ 02:21 0:00 | \_ /bin/sh -c for DIR in */ ; do \ ?if [ -f "$DIR/Makefile" -a "$DIR" != "tools" ]; then \ ??make -C "$DIR" dokclean ; \ ?fi ; \ done
rvalles 1197617 3.0 0.0 12800 4244 pts/7 S+ 02:27 0:00 | \_ make -C tools/ dokclean
As far as I can tell, it's these dokcleans. Each takes a very long time, with low cpu usage.
I do notice that the recursive dokclean does run "slow" on my system - about 1/10 second each? What version of make are you running? I'm not sure yet whether it's the directory searching or make reinvocation that's taking time.
Are you saying that running make kclean is taking hours?!?!! We definitely need to figure that out.
Good news is that you don't actually have to run make kclean after modifying ne2k.c. You can just run make and it will build correctly. make kclean is only required sometimes after .h file modifications (which should be handled correctly but sometimes are not).
$ make --version
GNU Make 4.4.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Host is an up to date Arch Linux.
When gcc runs, that's of course very fast. But it takes tens of minutes to get there, possibly over an hour for a whole clean (I imagine kclean will be faster).
Good to know re: kclean only needed for header changes.
It is also possible that your shell wildcard expansion is taking forever...:
for DIR in */ ; do \
if [ -f "$DIR/Makefile" -a "$DIR" != "tools" ]; then \
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C "$DIR" dokclean ; \
If I'm not mistaken, the above will run a shell for the for loop. It seems there is something funky going on with either fast shell or recursive make invocations...
It finished. It only took this long because this was kclean rather than a full clean.
$ time make kclean &> slow_kclean.log
real 23m47.237s
user 16m26.652s
sys 13m56.799s
Now, to the actual building.
gcc does its job real fast. Each file compiles in an instant, but there's a huge gap between files.
It's spending a very long time on something else. make is all I get to see in top's output, whereas gcc is seldom there.
It sounds like there is something wrong with your make, /bin/sh, or possibly glob expansion. ELKS is using an older, but Linux kernel standard, make system. I am investigating this a bit with a shell script that performs for loops for you to try.
Try copying this to /tmp/rec.sh and then running /tmp/rec.sh from the elks home dir:
#!/bin/sh
for DIR in */ ; do
    if [ -f "$DIR/Makefile" -a "$DIR" != "tools" ]; then
        echo "$DIR" ;
        cd "$DIR" ;
        /tmp/rec.sh ;
        cd .. ;
    fi ;
done
This uses the shell to glob and test directories. If this runs quickly, then I would say make -C is the slowdown...
(...)
time/
tools/
real 0m1.365s
user 0m0.722s
sys 0m0.633s
This is slower than it should be, but not that slow.
Going to get some sleep. I'll tell you how long the actual build took once I get up.
As an added note before I sleep, dash takes 0.4s instead of 1.4s (done a few reruns for both, for consistency).
Dash really makes a difference, but it's still way slower than I'd like.
I might switch the sh link to dash (it's bash by default) at some point, after the build ends.
Get some sleep! When you wake up, consider:
This is slower than it should be, but not that slow.
1.4s to just wildcard expand a few directories and echo them? I would say very very slow, although even on my system this takes 0.5 seconds!
dash takes 0.4s instead of 1.4s
Definite improvement - is /bin/sh actually bash then?
I might switch the sh link to dash (it's bash by default) at some point
I think setting SHELL=/bin/dash may direct make to use dash rather than the default shell. However, this may need to be in the top-level Makefile and exported to work in all the make -C invocations.
$ time make &> onemake.log
real 95m11.432s
user 65m38.587s
sys 55m45.458s
[11:07:53.558] tio v2.6
[11:07:53.558] Press ctrl-t q to quit
[11:07:53.574] Connected
Direct console, scan kbd 80x25 emulating ANSI (3 virtual consoles)
ttyS0 at 0x3f8, irq 4 is a 16450
ttyS1 at 0x2f8, irq 3 is a 16450
64 ext buffers, 65536 ram
eth: ne0 at 0x300, irq 11
ne2k_reset();
ne2k_get_hw_addr(prom);
Calling reset made it freeze, not sure if in the reset itself or when trying to get the prom contents.
Out of ideas, for now.
Hello @rvalles,
Do you happen to have another NE2K-style network card around? I'm pretty sure the networking works well, but thought that direction might be easier given how long it takes to compile the system with your possible shell/make issue.
Thank you!
I believe I have another 486 (DX/33 I think?) in my old stuff pile. It definitely has a NIC, although I do not know which. I can try the release there, and should it work, check what happens if I swap cards.
I will get to it once I get the chance. Might be later today or tomorrow.
re: slow build, I could prepare a docker build environment, and see what happens there. If it is slow, anyone will be able to reproduce it with the Dockerfile.
... that 486 has no cards installed. But a 386 had this NIC:
Some info about this one can be found here:
http://en.techinfodepot.shoutwiki.com/wiki/D-Link_DE-220P_rev_D2
Now cleaning the contacts with an eraser and alcohol. This one seems pnp, and got no jumpers. I recall there was a dos tool to set it up (set IRQs and such). I will try and see if I can make it work on DOS first.
I will not install the boot rom for now, but instead will boot dos via my optromloader floppy (w/xtide universal bios).
It seems to work with this NIC.
This is using the downloaded ELKS 0.7.0 1440K image, only edited config files to set up IRQ and ip addresses.
$ telnet 192.168.1.186
Trying 192.168.1.186...
Connected to 192.168.1.186.
Escape character is '^]'.
login: root
# uname -a
ELKS elks 0.7.0 commit d043b92d 03 Aug 2023 07:39:59 -0700 ibmpc i8086
# grep ne0 /bootopts
#console=ttyS0,57600 debug net=ne0 3 # sercons, multiuser, networking
ne0=10,0x300,,0x80
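The ne0= line above is a comma-separated list in which a field may be left empty. As an illustration only (this is not the ELKS /bootopts parser, and `netcfg` and `netcfg_parse` are hypothetical names), here is a small C sketch that splits such a value into up to four numeric fields and records which ones were actually given. Judging by the boot logs in this thread, the first two fields look like the IRQ and I/O port and the last one the flags. What the third field means is not stated here, so the sketch only tracks it as "not present".

```c
#include <stdlib.h>
#include <string.h>

#define NETCFG_FIELDS 4

struct netcfg {
    unsigned long value[NETCFG_FIELDS];
    int present[NETCFG_FIELDS];   /* 0 when the field was left empty */
};

/* Parse "a,b,,d" style values. Numbers may be decimal or 0x-prefixed hex.
 * Returns the number of fields seen, or -1 on a malformed number. */
int netcfg_parse(const char *s, struct netcfg *cfg)
{
    int n = 0;

    memset(cfg, 0, sizeof(*cfg));
    while (n < NETCFG_FIELDS) {
        const char *comma = strchr(s, ',');
        size_t len = comma ? (size_t)(comma - s) : strlen(s);

        if (len > 0) {
            char *end;
            unsigned long v = strtoul(s, &end, 0);  /* base 0: accepts 0x.. */
            if (end != s + len)
                return -1;                          /* junk inside the field */
            cfg->value[n] = v;
            cfg->present[n] = 1;
        }
        n++;
        if (!comma)
            break;
        s = comma + 1;
    }
    return n;
}
```

Parsing the string "10,0x300,,0x80" this way yields four fields: 10, 0x300, an empty third field, and 0x80.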
Of course, the other NIC works with freedos, so we still need to figure out why it doesn't with ELKS.
Enabled full duplex and boot rom, and moved the boot rom to it.
Even with the boot rom (XTIDE Universal BIOS), network still works on ELKS.
Of course, the other NIC works with freedos, so we still need to figure out why it doesn't with ELKS.
Thanks for the testing, and glad to have a working ELKS comparative test case NE2K NIC. I will need to read up on the differences between the cards based on the links you supplied. Do you happen to know whether the cards differ by chip type or ISA card addressing (word/byte) etc?
Both are 16bit ISA as stated, but the problem one gets detected as 8-bit by the early check in ne2k_drv_init(), which causes the mac address to be misinterpreted.
They have different chips as pictured, I'd assume both re-labeled from different manufacturers.
The next thing I'll do when I have some time to work on this will be to make a dockerfile for the build environment, using Arch Linux. If it's slow, I can then easily check whether it is also slow on another machine, and also make one with Debian to see if that is slow.
On ELKS itself, I'm interested in dumping the eeprom of the good card too, then perhaps try to manually parse it referencing ELKS and somebody else's ne2000 code.
Looking into it again. Discovered there's a Dockerfile already.
Note said Dockerfile needs to be modified to install the patch package.
Do you really need Docker to build? What about the option of building using export SHELL=/bin/dash, doesn't that speed up the build to normal? Or is that still quite slow?
I wonder what Arch Linux did that's causing /bin/bash to run so very slowly...
Docker image builds fast.
real 3m35.158s
user 2m31.780s
sys 1m6.723s
(from a clean tree)
I should try and make an arch-based equivalent Dockerfile to see if we can reproduce the slowness. But not right now, for this ticket.
ATM writing a floppy. I want to try and dump the eeprom values on the working ne2000, for comparison purposes.
SAPROM reads:
00008080c8c8f0f03232dcdc0000000000000000000000000000000057575757
The other NIC (the problem one) did read:
00358035c8352235df22696944002d004c28695a6e006b06208420a8570d5713
The other NIC (the problem one) did read: 00358035c8352235df22696944002d004c28695a6e006b06208420a8570d5713
There's a lot of repeating 35s in the first part of that line... I'm wondering if this is more than just a word vs byte read, perhaps a delay is needed for the bad NIC? Can the two lines be displayed by any code running outside ELKS for comparison?
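The two SAPROM dumps quoted above can be compared word by word. This sketch classifies a 32-byte dump by what the high byte of each 16-bit word looks like, following the two patterns the driver comments mention (high byte always 00, or high byte a copy of the low byte). The enum and function names are hypothetical, and this is not the j/k logic in ne2k_drv_init(), which is not shown in full in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define PROM_WORDS 16

enum prom_shape {
    PROM_HIGH_ZERO,     /* high byte of every word is 00 */
    PROM_HIGH_MIRROR,   /* high byte repeats the low byte in every word */
    PROM_HIGH_MIXED     /* neither pattern holds for the whole dump */
};

/* raw holds PROM_WORDS little-endian words: low byte first, then high byte. */
enum prom_shape prom_classify(const uint8_t raw[2 * PROM_WORDS])
{
    int all_zero = 1, all_mirror = 1;

    for (size_t w = 0; w < PROM_WORDS; w++) {
        uint8_t lo = raw[2 * w], hi = raw[2 * w + 1];
        if (hi != 0)
            all_zero = 0;
        if (hi != lo)
            all_mirror = 0;
    }
    if (all_zero)
        return PROM_HIGH_ZERO;
    if (all_mirror)
        return PROM_HIGH_MIRROR;
    return PROM_HIGH_MIXED;
}

/* Dump from the working card, as posted above. */
const uint8_t prom_working[2 * PROM_WORDS] = {
    0x00, 0x00, 0x80, 0x80, 0xc8, 0xc8, 0xf0, 0xf0,
    0x32, 0x32, 0xdc, 0xdc, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x57, 0x57, 0x57, 0x57
};

/* Dump from the problem card, as posted above. */
const uint8_t prom_problem[2 * PROM_WORDS] = {
    0x00, 0x35, 0x80, 0x35, 0xc8, 0x35, 0x22, 0x35,
    0xdf, 0x22, 0x69, 0x69, 0x44, 0x00, 0x2d, 0x00,
    0x4c, 0x28, 0x69, 0x5a, 0x6e, 0x00, 0x6b, 0x06,
    0x20, 0x84, 0x20, 0xa8, 0x57, 0x0d, 0x57, 0x13
};
```

The working card's dump comes out as PROM_HIGH_MIRROR and the problem card's as PROM_HIGH_MIXED. In the problem dump the low bytes still look sane: after the six MAC bytes they read 44 2d 4c 69 6e 6b 20 20, which is ASCII for "D-Link" plus two spaces, followed by the 57 57 signature. That suggests the low half of each word is valid and only the high half is noise.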
There's a Crynwr ne2k driver that we know works with the card.
I'll see if I can figure out how to compile it, when I have some time. It should not be too hard to make it print the SAPROM.
I have been experimenting with several NE2000 cards running in 8 bit mode. I have a DE220P that I could not get working for any reason. Recently, I bought a Kingston KNET-20T off of eBay from 1996 that was advertised as being 100% ne2000 compatible. I installed it on my 80386, set to IRQ=3 Address=300h. Tested it with MSDOS and mTCP. Installed it into ELKS 0.7.0, set my bootopts to net=ne0, ne0=3,0x300,0x81 and it came up flawlessly the first time. I was able to FTP to my FTP Server and Telnet locally. I did assign localip, nameserver and gateway manually. From what I can see, not all NE2000 are created equally. Let me know if I can do some testing. My system is a Leading Edge Model D at 4.77mhz, 640K, 8 Bit CF Card adapter with XTIDE and ISA VGA Card. Thanks for making ELKS a reality. Geoff.
Hello @gepoolejr,
Thanks for the testing report on your NE2K NIC! I have to agree, it seems various NE2K are not compatible, but I am glad to hear that many are with ELKS.
My system is a Leading Edge Model D at 4.77mhz,
Wow, a Leading Edge Model D. I had one of those and used it as my primary development system long ago. I really liked that machine. I seem to recall it ran quickly, although I'm not sure how, if they ran at 4.77mhz.
Thanks for the testing report on your NE2K NIC! I have to agree, it seems various NE2K are not compatible, but I am glad to hear that many are with ELKS.
Happy to do it. I also tried my newer WD8013 and couldn't get it working either. It seems that legacy hardware from the pre-2000 era works best. I haven't tried to install an XTIDE chip on the card yet, but that is on my testing schedule. One of these days, I will start looking at the 8390 - I have several 3C503 cards (8 and 16) and they are pretty well documented.
Wow, a Leading Edge Model D. I had one of those and used it as my primary development system long ago. I really liked that machine. I seem to recall it ran quickly, although I'm not sure how, if they ran at 4.77mhz.
It runs pretty good overall. It is my fav 8088 system also. I dropped in an NEC V20 which speeds it up a bit, but the 80186 instruction set makes it more useful. I originally used it to load Minix 1.5 because I really wanted to do things in smaller spaces. I also used it to demo Minix 1.5 at VCFNW in 2019 before COVID. Had it almost self booting the root disk but I couldn't quite get the bootsec file set up correctly from the Minix Usenet archives. Fortunately, ELKS has made a lot of headway. Plan to get a cross development system set up soon.
I have a DE220P that I could not get working for any reason.
Notice the one that works on my end is the DE220P (!), whereas a DE200 doesn't.
See above for pictures of the cards, and the config I used with the DE220P.
Notice the one that works on my end is the DE220P (!), whereas a DE200 doesn't.
See above for pictures of the cards, and the config I used with the DE220P.
I'm sure it works on a 16 bit ISA slot. I'm testing on an 8 bit slot with ne0=3,0x300,,0x81. That configuration doesn't work - not sure why. The only card I've used so far as a NE2000 in 8 bit mode is the Kingston KNET-20T.
Raw data
Enabled serial console, producing the log below.
Direct console, scan kbd 80x25 emulating ANSI (3 virtual consoles)
ttyS0 at 0x3f8, irq 4 is a 16450
ttyS1 at 0x2f8, irq 3 is a 16450
64 ext buffers, 65536 ram
eth: ne0 at 0x300, irq 11, (ne1k) MAC 00:35:80:35:c8:35 (16k buffer), flags 0xa2
eth: wd0 at 0x240, irq 2, ram 0xce00 not found
eth: 3c0 at 0x330, irq 11 not found
bioshd: hda BIOS CHS 1024,255,63
bioshd: hda IDE CHS 16383,16,63
/dev/hda: 16383 cylinders, 16 heads, 63 sectors = 8063.5 Mb
/dev/fd0: 80 cylinders, 2 heads, 18 sectors = 1440.0 kb
Partitions: hda:(0,16514064) hda1:(63,8401932)
device_setup: BIOS drive 0x0, root device 0x380
PC/AT class machine, syscaps 0xff, 638K base ram.
ELKS kernel 0.7.0 (59312 text, 12128 ftext, 8528 data, 42624 bss, 14382 heap)
Kernel text at 2d0:0000, ftext 114b:0000, data 1441:0000, top 9f80:0, 492K free
fd: /dev/fd0 ELKS bootable, has 80 cylinders, 2 heads, and 18 sectors
MINIX-fs: mounting unchecked file system 0x380, running fsck is recommended.
VFS: Mounted root 0x0380 (minix filesystem).
Running /etc/rc.sys script
Fri Aug 11 17:19:43 2023
ELKS 0.7.0
login: root
elks86# net show
ip 192.168.1.186 gateway 192.168.1.1 mask 255.255.255.0 ne0
elks86# net start
Starting networking on ne0
ktcp -b -p ne0 192.168.1.186 192.168.1.1 255.255.255.0
(frozen)