uweseimet / scsi2pi

Advanced performant SCSI/SASI emulation and tools for the PiSCSI/RaSCSI board
https://www.scsi2pi.net
BSD 3-Clause "New" or "Revised" License
10 stars 2 forks source link

DaynaPort: Improve the work-around for the MacOS driver, support NetBSD for the Mac #5

Closed uweseimet closed 11 months ago

uweseimet commented 11 months ago

This ticket results from https://github.com/PiSCSI/piscsi/issues/1098 in the PiSCSI project.

There are DaynaPort drivers for the Atari, MacOS and NetBSD on 68k Macs. The MacOS driver is broken and requires a delay when data (a network packet) are returned. Currently this delay is the default behavior, but the default should not be tailored to a broken drivers but to working drivers. The delay work-around actually breaks the NetBSD driver to work. Due to another peculiarity of the MacOS driver it is possible to detect whether this driver is the one sending a SCSI command: The driver is the only known driver that requests 17 bytes of INQUIRY data during its initialization. In order to make s2p work for all known drivers the delay work-around should only be applied when the MacOS driver is connected. In order to support multi-initiator systems s2p has to remember which initiator needs the work-around (MacOS) and which not (all others.)

uweseimet commented 11 months ago

@speakers-k64 As promised in https://github.com/PiSCSI/piscsi/issues/1098 I am getting back to you now regarding NetBSD and the DaynaPort emulation. I re-created your ticket in the new SCSI2Pi project and will soon start working on it. I would appreciate your testing, because you are most likely the only one who has a suitable setup.

uweseimet commented 11 months ago

@speakers-k64 There is an issue_5 branch now that should resolve your issue. The send delay is set per initiator, based on the last INQUIRY received by that initiator. If you only have a slow Pi and compiling that branch takes long I can provide a pre-compiled Debian binary package for testing.

speakers-k64 commented 11 months ago

@uweseimet I've yet to build and test this branch, but from inspection I don't think it will work based on my previous tests and diffs. I believe the send delay needs to be set on the ENABLE following the INQUIRY. MacOS issues multiple inquiries and not all carry the telltale length of 37 - so this branch would erroneously result in no send delay for MacOS. But I will test this in the next few days.

uweseimet commented 11 months ago

@speakers-k64 Thank you for your feedback. Looks as if this work-around is becoming even uglier than I thought ;-). I was already afraied that there would be more than one INQUIRY command ... Anyway, I would like to resolve this, so that it works for you and also for any other environment. Note that there is already a (32 bit) binary build of the respective branch: https://www.scsi2pi.net/packages/develop/scsi2pi_1.1_devel_11a74ab_debug_armhf-1.deb. If you are using a 32 bit OS this build should work for you and you do not need to compile first. Further testing may require compiling anyway, though.

Whatever the final solution may look like, we must be missing something. Real DaynaPort hardware does not care for the OS, i.e. either there will always be a delay or never, and still it appears to work with both NetBSD and MacOS, doesn't it? So there must be a solution without any work-around.

speakers-k64 commented 11 months ago

I can build your issue_5 branch but since this is in the scsi2pi project it's not clear to me how to run it. The piscsi project uses the piscsi.service: how should scsi2pi be started? I use the web ui to control piscsi.

As to real daynaport hardware, I have no idea whether it can talk to NetBSD. I have a written a NetBSD driver myself for NetBSD 3. Another driver was released about a year ago for current NetBSD releases but I have yet to test this. Although modern NetBSD still runs on vintage Macs, it's painfully slow - hence my desire to run an earlier, simple release.

uweseimet commented 11 months ago

Just (e.g. temporarily) replace the piscsi binary with the s2p bianry and restart the service. Except for the improvements mentioned on the SCSI2Pi webpage it is fully compatible. In case you use the Debian packages I provide this is all done automatically. The downloads webpage (https://www.scsi2pi.net/en/downloads.html) has the details.

Regarding the NetBSD driver that has issues with PiSCSI/SCSI2Pi, is this your own driver? I had thought it was a driver that was part of NetBSD.

speakers-k64 commented 11 months ago

Just (e.g. temporarily) replace the piscsi binary with the s2p bianry and restart the service. Except for the improvements mentioned on the SCSI2Pi webpage it is fully compatible.

Ah. It wasn't clear that a rename and replace was required.

In case you use the Debian packages I provide this is all done automatically. The downloads webpage (https://www.scsi2pi.net/en/downloads.html) has the details.

I'm not using Debian packages.

Regarding the NetBSD driver that has issues with PiSCSI/SCSI2Pi, is this your own driver? I had thought it was a driver that was part of NetBSD.

There is no builtin NetBSD driver. I'm using one I developed myself. However, another was contributed in Dec 2022 - "dse". Both this driver and my own are based on the older Cabletron EA41x scsi ethernet adaptor driver - "se" - which has been present for a long time.

uweseimet commented 11 months ago

I see, and all of these three NetBSD drivers are failing with this delay? Even though the delay is mysterious it does not violate the SCSI specificiation, because the data transfer is controlled by the REQ/ACK handshake and not by any timing. I wonder whether there are old Macs that do not have a SCSI controller? With a SCSI controller the handshake can hardly cause a problem.

From that perspective any driver should work despite this delay. But the MacOS driver requires it (at least with slow Macs), which was confirmed recently. I assume that this driver is an "official" driver provided by the DaynaPort vendors. This would indicate that there is also a delay with a real DaynaPort device, which The Atari drivers (there are two, both have the same codebase) work with and without a delay. I know how the handshake works in this case, it is definitely SCSI compliant. The NetBSD drivers only work without the delay if I understand you correctly. This might indicate that something is wrong with their handshake. Have you verified this in your driver's code?

This is all rather strange. Anyway, I'm waiting for your feedback on the issue_5 branch. You are the only one who can test it ;-).

speakers-k64 commented 11 months ago

I see, and all of these three NetBSD drivers are failing with this delay?

Let me clarify. There are only two drivers with DaynaPort support: dse and my own "pse". I only have used pse under NetBSD 3.1. Dse is implemented only for recent NetBSD releases (9/10), is only available in source form, and I have neither built nor run it (in fact, I currently can't build any NetBSD 9 kernel on any of my hardware).

This is all rather strange. Anyway, I'm waiting for your feedback on the issue_5 branch. You are the only one who can test it ;-).

I can now confirm that the issue_5 branch doesn't work with Mac0S - an ethernet device is seen but tcpip doesn't initialize. But it does work with my driver on NetBSD 3.1 .. although I'm seeing some bad packet lengths and drops for an unknown reason.

uweseimet commented 11 months ago

@speakers-k64 Can you please compare this behavior with the scsi2pi develop branch, and if this also causes issues with MacOS, with the piscsi 23.11.01 branch? Thank you. We definitely need a codebase that properly works with MacOS as a starting point for any changes required for NetBSD. Any results on the number of INQUIRY commands sent by NetBSD and MacOS? You mentioned that there might be more than one.

speakers-k64 commented 11 months ago

I've now tested the issue_5 branch, develop branch and the piscsi 20231101 tree. I've also tested with both my pse driver with the NetBSD 3.1 kernel, and the released dse driver in both 3.1 and 9.1:

Branch issue_5 develop 20231101 OS MacOS 7.6 - OK OK pse/3.1 OK - - dse/3.1 OK - - dse/9.1 OK - -

In summary, MacOS requires the send delay and NetBSD requires no delay.

In all cases the scsi device is discovered via the INQUIRY but packets are not received correctly -- both NetBSD drivers report invalid and large packets.

I will capture a log of the INQUIRY sequence bot both MacOS and NetBSD.

uweseimet commented 11 months ago

@speakers-k64 Thank you for spending time on testing. I have run tests with the develop branch and the Atari, and it's (still) working for this platform. This means that we have two platforms where the develop and 20231101 code is definitely known to be working (Mac and Atari), and for NetBSD we know that it is not working. Since you provided your first set of NetBSD patches for PiSCSI there were no changes in the DaynaPort code logic, as far as I can recall, except for changing the packet size from 64 to 128. Are you sure that the delay is the only issue with NetBSD? If this is the case knowing how INQUIRY is used would indeed be important.

speakers-k64 commented 11 months ago

Attached are trace logs for:

  1. The 20231101 piscsi tree: piscsi.20231101.log
  2. The issue_5 branch of the s2p tree: piscsi.issue_5.log

Each with MacOS 7.5 being booted to Finder followed by NetBSD 3.1 + dse driver being booted to the multi-user login prompt.

In the first case, MacOS networking is fully functional but NetBSD sees input errors. In the case, MacOS networking does not work but NetBSD is mostly functional (some input errors occur).

speakers-k64 commented 11 months ago

Since you provided your first set of NetBSD patches for PiSCSI there were no changes in the DaynaPort code logic, as far as I can recall, except for changing the packet size from 64 to 128. Are you sure that the delay is the only issue with NetBSD? If this is the case knowing how INQUIRY is used would indeed be important.

Yes, the delay is now the only issue. Here is my patch for the piscsi develop tree which addresses solely this issue. develop.patch.txt

uweseimet commented 11 months ago

@speakers-k64 I wil revisit your patch latest next week. As already mentioned this patch fails with a multi-initiator environment, but most likely the DaynaPort (also the real device) would not work in such an environment anyway, because it is not stateless (due to its MAC address and buffered network packets). Hard drives, on the other hand, explicitly support such an environment. Regarding the DaynaPort I just merged a fix for #12 (the PiSCSI counterpart is https://github.com/PiSCSI/piscsi/issues/598) into develop, so that it is now possible to have more than one Pi with a DaynaPort emulation in the same network because the DaynaPort's MAC address is not static anymore but derived from the MAC address of the bridge interface.

I will get back to you as soon as I have a new branch, based on your patch, ready for testing. In the meantime, can you please check what the minimum delay is that NetBSD can still cope with? Currently the delay is 100.000 ns, but maybe the NetBSD drivers work with 50.000 or 70.000 or so ns. I am asking this because there is very likely a delay that both NetBSD and slow Macs are fine with, because a real DaynaPort does not have dynamic delays. This is why it would be helpful to know where the limits for the respective drivers are.

uweseimet commented 11 months ago

@speakers-k64 There is an "issue_5_2" branch now which contains a slightly modified variant of your patch. Telling the compiler lies about whether a method is const or not is not such a good idea ;-). If InquiryInternal() cannot be const anymore I don't like it, but so be it. Please test and please remember to provide information on which delay is still acceptable for NetBSD. Thank you.

speakers-k64 commented 11 months ago

@speakers-k64 There is an "issue_5_2" branch now which contains a slightly modified variant of your patch. Telling the compiler lies about whether a method is const or not is not such a good idea ;-). If InquiryInternal() cannot be const anymore I don't like it, but so be it.

Yeah - my patch is a hack.

Please test and please remember to provide information on which delay is still acceptable for NetBSD. Thank you.

Your issue_5_2 branch works fine for MacOS .. ftp transfers are good. For NetBSD, the network comes up, but, as I reported before, I see a high number of packet errors reported and ftp transfers typically stall. These errors prevent me from being able to test with a different delay since I need a clean baseline.

I don't see this behavior using the piscsi develop branch (with my patch applied). So I'll try to compare trace logs to see if I can pinpoint how/where these errors originate.

uweseimet commented 11 months ago

@speakers-k64 OK, thank you for the feedback. The PiSCSI branch should not differ that much. As far as I remember the main difference is commented out code, which I removed. There was also code that was not commented out but was never executed because it depended on a condition that was always true (or always false). This code I also removed. It is strange that code that works for MacOS and the Atari causes issues with NetBSD.

One more thing, even though it is very unlikely that this change causes the issues you observe: In buses/rpi_bus.h there is a preprocessor switch "NO_IRQ_DISABLE", which is enabled by default. If you don't find any issue in the actual daynaport code you might consider changing this switch in order to check whether this changes anything. Another difference is the dynamic calculation of the daynaport's MAC address, but this is just a formal thing and should not have any negative effect, because this address is set once and then remains unchanged.

speakers-k64 commented 11 months ago

One more thing, even though it is very unlikely that this change causes the issues you observe: In buses/rpi_bus.h there is a preprocessor switch "NO_IRQ_DISABLE", which is enabled by default.

No, that's it. It smelled like that sort of random flakiness so I commented out that definition(s) in the header and rebuilt, and we're good.

I can now build with non-zero send delays and see what the effect is.

uweseimet commented 11 months ago

@speakers-k64 Good news! In the issue_5_2 branch I changed the default setting and added some comments, which might be helpful for future changes. It would have been nice to keep the current setting because it might make porting to other hardware easier, but obviously it is is not that simple. I will wait for your feedback regarding the delays before merging into the develop branch. Maybe in the future it turns out that there is kind of a common denominator, a timing that is both fine for NetBSD and slow Macs, so that all drivers work with the same settings. This is why I would like to gather this information.

Happy new year!

speakers-k64 commented 11 months ago

@uweseimet I've pulled your changes to the issue_5_2 brach and can confirm that it's good for MacOS and NetBSD.

I'll make changes with non-zero send delays and get back to you.

Happy New Year to you too. I'm guessing it's 2024 already for you! I'm in Silicon Valley where it's still 2023 as I write.

speakers-k64 commented 11 months ago

@uweseimet I've experimented with the SendHandShake delay and the bottom line is that there cannot be any delay (including 0 which i will explain) for NetBSD to be reliable. For MacOS, 5us is sufficient.

In the issue_5_2 branch, I set the usual send delay of DAYNAPORT_READ_HEADER_SZ for both MacOS abd NetBSD. Then I modified the nanosleep() arguments in SendHandShake() in cpp/busses/gpio_bus.cpp. Note that 1) the definition of SCSI_DELAY_SEND_DATA_DAYNAPORT_NS in daynaport.h is ignored, and 2) the header dependency for gpio_bus.h was unreliable and so I modified gpio_bus.cpp directly.

The nanosleep delay can be reduced to 5us for MacOS still to function. But setting no delay - by not calling nanosleep() - results in MacOS not working and NetBSD showing signs of life but with multiple input errors. Apparently, the potential rescheduling latency triggered by momentary IRQ disabling is sufficient to cause packet errors for NetBSD.

uweseimet commented 11 months ago

@speakers-k64 Correct me if I am wrong, but this means that there is no way to have reliable DaynaPort support for NetBSD at all, is there? None of the possible code/timing changes resolves your issues, neither for PiSCSi nor for SCSI2Pi, and also your patch does not work reliably? Or is there any configuration that works?

In this case this is a very strong indication that the NetBSD driver does not use a proper ACK/REQ handshake and/or uses much too short timeouts. This would explain why there tend to be less problems the faster the Pi provides data. Am I missing something?

The SCSI_DELAY_SEND_DATA_DAYNAPORT_NS in daynaport.h was an unused duplicate, which is now removed.

Thank you for pointing out the header file issue. There was a typo in the Makefile, related to the dependencies of files in the buses folder.

speakers-k64 commented 11 months ago

@speakers-k64 Correct me if I am wrong, but this means that there is no way to have reliable DaynaPort support for NetBSD at all, is there? None of the possible code/timing changes resolves your issues, neither for PiSCSi nor for SCSI2Pi, and also your patch does not work reliably? Or is there any configuration that works?

Your issue_5_2 branch (selecting a send delay on ENABLE only after a 37 byte INQUIRY has been seen) works fine for NetBSD. My experiments involved setting a send delay for NetBSD and MacOS and trying to find a value that satisfied both. And I couldn't.

In this case this is a very strong indication that the NetBSD driver does not use a proper ACK/REQ handshake and/or uses much too short timeouts. This would explain why there tend to be less problems the faster the Pi provides data. Am I missing something?

The NetBSD scsi bus driver is common to all scsi devices and I'm sure it's performing properly. Perhaps the timeouts are short .. but I'm not familiar with this code at all. My suspicion is that the userspace PISCSI implementation may have too much scheduling latency when it sleeps and this may lead to random excessive delays and consequent timeouts.

uweseimet commented 11 months ago

@speakers-k64 If I understand you correctly, from your pespective we would be good with merging the current issue_5_2 branch into the develop branch?

Regarding scheduling latencies etc. I'm afraid I do not agree. The range of platforms and drivers (on the PiSCSI Wiki) that do not have these issues is rather huge. When there is only one platform/driver that has a particular issue, assuming that this platform/driver causes it is reasonable. At least all other drivers do a much better job. Hard drive transfers do not have any timing issue, do they? If they don't this also indicates that there is something special with the DaynaPort driver. Just like there is something special with the DaynaPort driver for MacOS ;-).

Which Pi version are you using, by the way? If you have more than one Pi one would expect more problems with slower Pis than with faster Pis. And one would expect more problems with PiSCSI than with SCSI2Pi, because with the latter transfers are several percent faster. (At least according to what I have measured with a Pi 4.)

speakers-k64 commented 11 months ago

@speakers-k64 If I understand you correctly, from your pespective we would be good with merging the current issue_5_2 branch into the develop branch?

Yes, please.

Regarding scheduling latencies etc. I'm afraid I do not agree.

How do you then explain the change in behavior I saw with NO_IRQ_DISABLE?

Which Pi version are you using, by the way?

cp@2pi:~ $ uname -a
Linux 2pi 6.1.0-rpi6-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.58-1+rpt2 (2023-10-27) aarch64 GNU/Linux
cp@2pi:~ $ more /proc/cpuinfo 
processor   : 0
BogoMIPS    : 108.00
Features    : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part    : 0xd08
CPU revision    : 3

processor   : 1
BogoMIPS    : 108.00
Features    : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part    : 0xd08
CPU revision    : 3

processor   : 2
BogoMIPS    : 108.00
Features    : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part    : 0xd08
CPU revision    : 3

processor   : 3
BogoMIPS    : 108.00
Features    : fp asimd evtstrm crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part    : 0xd08
CPU revision    : 3

Hardware    : BCM2835
Revision    : c03115
Serial      : 10000000a1fd5a06
Model       : Raspberry Pi 4 Model B Rev 1.5
uweseimet commented 11 months ago

Regarding the IRQs I don't know whether switching them off serves any purpose or not. Just because the old (rascsi) code does it does not mean anything. There was a lot of stuff in the old code that was useless and caused issues instead of resolving them. There was even code that was supposed to make SCSI transfers faster, but in practice it made them slower. This is why I try to at least comment any peculiarities, so that whoever stumbles upon them knows a bit more than I did when looking at the code.

With NetBSD it is not just the IRQs but the timing requirements in general, which seem to differ from other platforms. And also from the SCSI specification, which is the ultimate source of information. This is why I cannot guarantee that I will forever try to support NetBSD for the Mac. If it turns out that the IRQ requirements cause issues with further progress, e.g. with porting to the Pi 5 (don't know if I am going to do that), I would decide in favor of the Pi 5, and not in favor of an exotic OS which has a timing issue no other platform has. I hope that you understand. But for now, and definitely for the SCSI2Pi 1.1 release (most likely later this month) I will merge and keep the code that also works for NetBSD on the Mac. Merging may take until tomorrow, because I am updating some unit tests in the issue_5_2 branch. These tests are not related to the ticket, but I claim that improving tests is something that can be done in any context. And they are even related to BSD, but in a different context: The amd64 BSD compilers are configured a bit differently than the ones on Linux and cause compile-time warnings, which I am currently trying to get rid off.

Thank you very much for helping with testing and with getting more insight into issues related to the DaynaPort and NetBSD. I will close this ticket after merging.

speakers-k64 commented 11 months ago

... Merging may take until tomorrow, because I am updating some unit tests in the issue_5_2 branch ...

There'a no huge hurry since I can build and run the patch myself ;-)

Thank you very much for helping with testing and with getting more insight into issues related to the DaynaPort and NetBSD. I will close this ticket after merging.

Thanks for being so responsive.

uweseimet commented 11 months ago

Merged into develop branch

uweseimet commented 10 months ago

The work-around will be removed with SCSI2Pi 1.2, see https://github.com/BlueSCSI/BlueSCSI-v2/pull/111 for details.