uweseimet / scsi2pi

Advanced performant SCSI/SASI emulation and tools for the PiSCSI/RaSCSI board
https://www.scsi2pi.net
BSD 3-Clause "New" or "Revised" License

Add simh image file support to tape device #100

Open uweseimet opened 1 week ago

uweseimet commented 1 week ago

This is a follow-up ticket to https://github.com/uweseimet/scsi2pi/issues/93. The tape device (SCTP) support in SCSI2Pi 4.0 focuses on reading and writing tar files. This has been tested with tar (Linux) and with Gemar (Atari).

In order to support more sophisticated tape drivers, which for instance make use of filemarks and of (reverse) spacing, a different image file format is required. tar files only contain the actual archived data, but not any other objects like filemarks or end-of-data markers, which are used by some drivers, e.g. bacula.

In the ideal case SCSI2Pi can use the same format that is used by open-simh, see https://github.com/open-simh/simh. See http://simh.trailing-edge.com/docs/simh_magtape.pdf for the specification of this file format.
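To make the difference from a plain tar file concrete, here is a minimal Python sketch of how such an image can be walked object by object. It is based on my reading of the linked specification and is not SCSI2Pi code: it assumes each object starts with a 32-bit little-endian header whose upper 4 bits are the class and whose lower 28 bits are the record length or marker value, that data records are padded to an even length and followed by a trailing copy of the length, and that a tape mark is the bare value 0. The helper names `walk_simh` and `record` and the sample image are hypothetical.

```python
import struct

TAPE_MARK = 0x00000000      # marker separating files on the tape
END_OF_MEDIUM = 0xFFFFFFFF  # marker ending the image


def walk_simh(image: bytes):
    """Yield (position, class, value) for each object in a SIMH-style image."""
    pos = 0
    while pos + 4 <= len(image):
        (header,) = struct.unpack_from("<I", image, pos)
        if header == END_OF_MEDIUM:
            return
        obj_class = header >> 28      # upper 4 bits
        value = header & 0x0FFFFFFF   # lower 28 bits
        yield pos, obj_class, value
        if header == TAPE_MARK:
            pos += 4                  # a tape mark is only the 4-byte marker
        elif obj_class == 0:
            padded = value + (value & 1)   # data is padded to an even length
            pos += 4 + padded + 4          # header + data + trailing length
        else:
            return  # other classes are not handled in this sketch


def record(data: bytes) -> bytes:
    """Build one class 0 data record (hypothetical helper for the demo)."""
    length = struct.pack("<I", len(data))
    return length + data + b"\x00" * (len(data) & 1) + length


# Synthetic image: three 80-byte labels, a tape mark, two 8192-byte records.
image = (record(b"L" * 80) * 3
         + struct.pack("<I", TAPE_MARK)
         + record(b"D" * 8192) * 2)
positions = [pos for pos, _, _ in walk_simh(image)]
print(positions)  # [0, 88, 176, 264, 268, 8468]
```

The important detail is that the reader has to skip the header, the padded data and the trailing length to reach the next object.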

uweseimet commented 1 week ago

@pacjunk The issue_100 branch now contains initial code that supports the simh file format for .tap files. I have already tested this code with my limited setup. tar can successfully write/read to/from .tap files in the new format. This means that at least consecutive data blocks are dealt with correctly, just like before with the proprietary raw format. I downloaded http://www.bitsavers.org/bits/DEC/vax/vms/TK50/aq-mi45a-be_vms_v4_v5_mand_upd_1988.tap in order to see what happens when I use this image file together with tar. Of course this cannot work, but it reveals that at least the size of the first object (a label I guess) is correctly evaluated:

[2024-11-11 22:28:12.431] [debug] (ID 3) - Controller is executing READ(6)/GET MESSAGE(6), CDB 08:01:00:00:14:00
[2024-11-11 22:28:12.431] [debug] (ID:LUN 3:0) - Device is executing READ(6)/GET MESSAGE(6) ($08)
[2024-11-11 22:28:12.431] [trace] (ID:LUN 3:0) - Position: 0, byte count: 10240
[2024-11-11 22:28:12.431] [trace] (ID:LUN 3:0) - Read simh header with class number 0, marker value/record length 80
[2024-11-11 22:28:12.431] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 0
[2024-11-11 22:28:12.431] [trace] (ID:LUN 3:0) - Actual block length of 80 byte(s) does not match expected block length of 512 byte(s)
[2024-11-11 22:28:12.432] [debug] (ID 3) - MEDIUM ERROR (Sense Key $03), NO ADDITIONAL_SENSE INFORMATION (ASC $00), ASCQ $00

When you start testing (e.g. by labelling a blank "tape") I expect some issues, new ones or familiar ones, we will see ...

Pacjunk commented 1 week ago

Wow, you have been busy overnight! I'll see if I can make some time...

Pacjunk commented 1 week ago

OK, I rebuilt on the issue_100 branch and did some more testing (sorry still haven't connected that tape drive yet)...

I initialised (labelled) a tape (success) init.txt

then I mounted that tape using the label (success) mount.txt

then I did a "dir" on the blank tape (success) dir empty.txt

then I attempted to copy a file to the tape (failed parity errors, device errors) copy file.txt

Another "dir" on the tape (more parity errors etc) dir after copy.txt

I then connected a pre-existing simh image from bitsavers. I mounted the tape (success!) mount simh.txt

Then a "dir" on the tape to check contents (parity errors, but did display the first file name on the tape - not all) dir simh.txt

That is all I had time for tonight. I'll try to get to the physical tape drive soon...

uweseimet commented 1 week ago

@Pacjunk I'm afraid I don't know what a parity error in this context means. The tape code does not deal with parity. My guess is that your driver is not satisfied with the data it sees or there are (still) issues with the tape-related data returned by REQUEST SENSE. But the data returned appear to be better than before now, because at first sight I could not find any attempt to reverse-space anymore in the logs.

I just fixed some issues and improved the logging, please use the latest available issue_100 branch for your next test. There is now also initial reverse-spacing support. Reminder: Always add information on the commit ID you were testing.

Please also do not forget to verify your web UI issue (see my comment in the issue_93 ticket about the web UI).

Regarding init.txt there is no command writing the label. Line 105 appears to successfully read it, so it must have been created before. Please ensure that no write operations are missing in the logs. It may make sense, though, to focus on existing simh files first, e.g. your "dir" test. As long as reading data does not fully work, writing data is unlikely to work better.

I noticed that the existing simh files contain private data records, which SCSI2Pi will not be able to process. See the simh file specification for details on private records.

[2024-11-12 19:42:12.985] [trace] (ID:LUN 3:0) - Read simh header with class 5, marker value/record length $0544353 from position 0

This is not necessarily an issue, but it means that there are custom data in the simh files that only the software that created these files can make use of. Or the detection of this type is still a bug in SCSI2Pi.

And one more thing: As far as I can tell you are testing with an optimized build. If this is the case, please test with a debug build, so that assertions are enabled.

Pacjunk commented 1 week ago

Looks like my previous logs were truncated...

I ran some more tests with build 99272d6.

Initialise a tape: init2.txt

Initialise a tape with erase: init erase.txt

Mount (any label): mount specified.txt

Mount (specified label): mount specified.txt

Copy File (still fails): copy file.txt

According to the help, "parity error" could mean almost anything...

        If this message is associated with a status code returned
                by a request to a magnetic tape driver, one or more of the
                following conditions can cause this error:

                  Attempt to read beyond the logical end of volume
                  Control bus parity error
                  Correctable data error (PE only)
                  Correctable skew (PE only)
                  CRC error (NRZI only)
                  Data bus parity error
                  Format error (PE only)
                  Invalid tape mark (NRZI only)
                  Longitudinal parity error (NRZI only)
                  Map parity error
                  MASSBUS control parity error
                  MASSBUS data parity error
                  Nonstandard gap
                  Read data substitute
                  Uncorrectable error (PE only)
                  Vertical parity error (NRZI only)

How do I do a debug build? I just do "make s2p". I assume there is some switch?

uweseimet commented 1 week ago

@Pacjunk Please refer to https://github.com/uweseimet/scsi2pi/wiki/Compilation-Instructions for build instructions. https://github.com/uweseimet/scsi2pi/commit/99272d61fc62a167166c1f90d0fc726e14649769 is rather old, 25 hours according to git. Please always use the latest available issue_100 branch for new tests. Otherwise you are spending time on outdated code. Please also read my latest comments before starting to test. My recommendation is not to test any write operations for now, but to focus on existing simh files, which you have checked with the new s2psimh tool, please see my previous comments for details, also in https://github.com/uweseimet/scsi2pi/issues/101.

Pacjunk commented 1 week ago

I'm sorry, but given that it takes 3-4 hours to do a build, I'm not constantly checking for updates. I tend to do a rebuild at night and then maybe test the next day. Because of time zone differences, this is when you make your changes so I'm always testing stale code. I will try the debug build. Also you edit your comments and github doesn't send an email for edits, only the original.

I managed to get a couple of tape drives working (had to reinstall in another enclosure), then one drive decided to chew some tapes, so this required dismantling to get the tapes out...

Anyway, these are the logs: (I tested first without a tape loaded, then with a tape loaded)

Sony SDT-5000 : sdt5000.txt

DEC TZ87 in TK50/70 compatibility mode (read only): tz87.txt

DEC TZ87 with TK85 tape: tz87_CT3.txt

I had some problems with that last drive. Not able to initialise tapes for some reason, but seems to read the tape and make all the right noises.

Pacjunk commented 1 week ago

OK, I rebuilt with debug to version 4e1f05a, and tested with the prebuilt simh image.

Mounting the tape (successful):
mount.txt

Directory listing of tape (failed): dir.txt

Interestingly, this time I didn't get the generic parity error. The OS reported "file or directory lookup failed. magnetic tape position lost"

Is there some sort of utility/option that can "sniff" the traffic going to and from the real drive. Might be interesting to compare it to the emulated drive.

Pacjunk commented 1 week ago

Here is the dump of the file I'm using. It appears to not be in ANSI format. There are 4 files on the tape. VMS054.A, .B, .C, .D. Only the first filename (VMS054.A) is displayed before the error. The filenames are stored next to the data, so you need to wind through the tape to find them. simh dump.txt

uweseimet commented 1 week ago

@Pacjunk I have had a quick look at the latest s2p logs, thank you. I don't think the mount log is required anymore, unless something changes for the worse and mounting fails. In dir.txt I stumble on this:

[2024-11-14 16:47:57.536] [debug] (ID 3) - Controller is executing SPACE(6), CDB 11:00:00:7f:ff:00
[2024-11-14 16:47:57.537] [debug] (ID:LUN 3:0) - Device is executing SPACE(6) ($11)
[2024-11-14 16:47:57.538] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0002000 at position 268
[2024-11-14 16:47:57.538] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 268
[2024-11-14 16:47:57.539] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $4000100 at position 272
[2024-11-14 16:47:57.539] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 272
[2024-11-14 16:47:57.540] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0010001 at position 276
[2024-11-14 16:47:57.540] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 276
[2024-11-14 16:47:57.542] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0000001 at position 280
[2024-11-14 16:47:57.544] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 280
[2024-11-14 16:47:57.544] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0000000 at position 284
[2024-11-14 16:47:57.545] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 1 at position 284
[2024-11-14 16:47:57.545] [trace] (ID:LUN 3:0) - Encountered filemark while spacing over blocks

s2p finds the data record at offset 268, which according to the s2psimh dump makes sense. But then it finds a data record also at offset 272, which is wrong. From the dump we can see that the next data record is at offset 8468. s2p does not appear to skip the data record correctly and is out-of-sync. I will have a look at this. Maybe you can also compare the dump with the s2p log, because four eyes see more than two.

Instead of the s2psimh dump of your file, can you please provide me a download link for the simh file you are using? I can then create a dump myself, and also have the file available for testing. You say there are 4 files on this tape. Do the filenames appear in the dump? There are 3 short data blocks with what appears to be names in them at the start and 2 at the end. In between there are only regular data blocks of 8192 bytes. Is this the content you would expect? In case you know the names of the files on the tape and they are not included in these short blocks, are they included in the 8192 byte blocks? You can grep for them, e.g. with

s2psimh -d YOUR_TAP_FILE | grep NAME_OF_THE_FILE

If we assume that the filenames are visible in the ASCII data of the dump, they should be found.

The traffic between your driver and the emulated drive is fully covered by logging on trace level. A tool would not provide more information. It would be nice, though, if there was a tool for your workstation that shows the traffic between your workstation and a real drive. For the Atari there is such a tool, maybe also for your workstation?

Thank you for running the s2pexec commands. The Sony and the DEC drive appear to support reverse-spacing. Since my drive also supports it, this feature appears to be quite common. But that's fine for SCSI2Pi, because I added reverse-spacing about 2 days ago to the issue_100 branch.

Regarding the issue_100 branch it's best if you update and re-compile (s2p and s2psimh are sufficient) before running new tests. Re-compiling will not always take very long, because usually only a few files may have changed since the last build.

Any news on your web UI issue?

uweseimet commented 1 week ago

@Pacjunk I have identified why s2p was out-of-sync. Please update and re-build issue_100 and run the directory test again. The s2p log should reveal the change, i.e. s2p should now skip the data record and find the next object at offset 8468 instead of offset 272. I have successfully tested this by sending the SPACE(6) command that was revealing the bug to a simh file with the in-process test tool on my PC:

>./bin/in_process_tool -s "-i 0 simh.tap"
...
s2pexec>-i 0 -L trace -c 11:00:00:7f:ff:00
...
[19:46:42.782] [debug] (ID 0) - Controller is executing SPACE(6), CDB 11:00:00:7f:ff:00
[19:46:42.782] [debug] (ID:LUN 0:0) - Device is executing SPACE(6) ($11)
[19:46:42.782] [trace] (ID:LUN 0:0) - Read SIMH header with class 0, value $0002000 at position 268
[19:46:42.782] [trace] (ID:LUN 0:0) - Searching for object type 0, found type 0 at position 268
[19:46:42.782] [trace] (ID:LUN 0:0) - Read SIMH header with class 0, value $0002000 at position 8468
[19:46:42.782] [trace] (ID:LUN 0:0) - Searching for object type 0, found type 0 at position 8468
...

The purpose of this SPACE(6) is to skip up to $7fff data records until a filemark (called tape mark in the simh notation) is found. I guess this is because for a directory listing the data records are not relevant. Filename information appears to be stored in the smaller 80 byte blocks, and a series of these blocks is marked with a filemark.
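For anyone who wants to decode such a CDB by hand, here is a small Python sketch. It assumes the SPACE(6) layout as I understand it from the SCSI stream commands standard: opcode $11 in byte 0, the code field in the low bits of byte 1 (0 = blocks, 1 = filemarks) and a signed 24-bit count in bytes 2 to 4, where a negative count means reverse spacing. The function name `decode_space6` is made up for this example.

```python
def decode_space6(cdb_text: str) -> tuple[int, int]:
    """Return (code, count) for a SPACE(6) CDB given as colon-separated hex."""
    cdb = bytes.fromhex(cdb_text.replace(":", ""))
    if len(cdb) != 6 or cdb[0] != 0x11:
        raise ValueError("not a SPACE(6) CDB")
    code = cdb[1] & 0x0F  # assumed 4-bit code field: 0 = blocks, 1 = filemarks
    # The count is a 24-bit two's complement number, big-endian.
    count = int.from_bytes(cdb[2:5], "big", signed=True)
    return code, count


print(decode_space6("11:00:00:7f:ff:00"))  # (0, 32767)
print(decode_space6("11:01:ff:ff:ff:00"))  # (1, -1)
```

The first call decodes the command from the log: space forward over up to $7fff (32767) blocks. The second CDB is an invented example of spacing backwards over one filemark.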

Pacjunk commented 1 week ago

Only a little testing today - got a migraine!

Anyway, the directory listing seems to have worked correctly : dir good.txt

I tried to copy the first (small) file off the tape (VMS054.A), but this failed: copy bad.txt

I think I am using http://www.bitsavers.org/bits/DEC/vax/vms/TK50/aq-jp22f-be_vms_v5.4_bin_1of2_1990.tap for the data. Can't be 100% sure though - the file name is different. I downloaded it a long time ago.

uweseimet commented 1 week ago

So there is gradual progress. I always need to know the commit ID your tests are based on, please remember this.

Your log reveals an issue with the block size:

[2024-11-15 18:23:08.206] [debug] (ID 3) - Controller is executing READ(6)/GET MESSAGE(6), CDB 08:00:00:20:00:00
[2024-11-15 18:23:08.207] [debug] (ID:LUN 3:0) - Device is executing READ(6)/GET MESSAGE(6) ($08)
[2024-11-15 18:23:08.207] [trace] (ID:LUN 3:0) - Current position: 268, requested byte count: 8192
[2024-11-15 18:23:08.208] [trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0002000 at position 268
[2024-11-15 18:23:08.208] [trace] (ID:LUN 3:0) - Searching for object type 0, found type 0 at position 268
[2024-11-15 18:23:08.209] [trace] (ID:LUN 3:0) - Actual block length of 8192 byte(s) does not match expected length of 512 byte(s)
[2024-11-15 18:23:08.209] [debug] (ID 3) - MEDIUM ERROR (Sense Key $03), NO ADDITIONAL_SENSE INFORMATION (ASC $00), ASCQ $00

Your system expects a default block size of 8192 bytes. The latest issue_100 branch supports this. Please test again after updating your sources and re-compiling. Ensure that your emulated tape uses a block size of 8192 bytes by adding "-b 8192" to the device parameters when you launch s2p.

If you are not sure which file you are using, please re-download it and compare it with your renamed file (e.g. with diff or bcmp), or use a different file. We have to ensure that I use the same file as you for my tests. As soon as it is ensured that we are using the same file, I would be interested in what the directory listing looks like. As you can see in my last comment I can replay the commands used by your driver with the in_process_tool, but I have to do this on the same file as your driver in order to get meaningful results.

When you say that you copy a file off the tape, I assume this is equivalent to what "tar xf" does, i.e. extracting the archive contents and writing them to the filesystem?

Pacjunk commented 1 week ago

Sorry, but I didn't get the build number. It was the current version immediately prior to my testing.

I have confirmed that the link for the .tap file above is the correct one. The directory listing dump is against that file.

The tape contains "save sets", which in unix speak is an uncompressed tar. I believe the default record size for these is 8192, but I believe it is configurable up to 32256. I will check the documentation.

I tried to copy one of these "tar" files off the tape, not extract it. This is an allowed operation.

Tapes can be mounted in the files11 format (which is what I have done), then the files can be operated on like any other disk files (except no directories). They can also be mounted as "foreign", which is for interchange with other systems or for creating/extracting these save sets directly.

The first format is the easiest if you just want to copy a handful of files on or off the tape without having to worry about packing them up into a saveset. The latter is better when you have thousands of files to copy.

uweseimet commented 1 week ago

@Pacjunk Thank you for providing information on the copy operation.

The newer the device, the bigger the supported block sizes appear to be. Current drives support sizes of 512 KB or maybe even more. With SCSI hard drives and MODE SELECT, or with the "-b" option, any size that is a multiple of 4 can be set up to the maximum supported size, which for tape drives is 8192 bytes now.

I will try to find out whether your driver tries to change the block size with one of the MODE SELECT commands it sends, but for now using "-b 8192" enforces the required size manually.

You can use "git log --abbrev-commit" to get the current commit ID of your checked out git branch, by the way.

Pacjunk commented 1 week ago

Checked the doco: "You can specify a block size between 2048 and 65,535 bytes." and also "The default block size for magnetic tape is 8192 bytes; the default for disk is 32,256 bytes." This is where I heard the 32256 figure. I believe if you create the save set (tar) on disk, then copy to tape it will be 32256. If written directly to tape, then 8192. The .tap files we are dealing with have been written directly to tape so 8192 should be fine.

Of course this block size is that within the file container. Standard on disk block size is still 512. (and 2048 for tapes)

uweseimet commented 1 week ago

@Pacjunk Thank you for checking this. I will try to make the SCSI2Pi block size support even more flexible than it already is. But for your next test the issue_100 branch is already prepared.

I checked the MODE SELECT statements: Your driver does not try to change the block size. But I missed that your driver does NOT set the Fixed flag in the READ(6) command, i.e. it explicitly tells the drive how many bytes to read: 8192. For your testing this means that despite what I said before, with the latest issue_100 branch you should not need the "-b 8192" option. Please first run a test without this option, just like your tests before. Only if this test fails, run your test a second time with the "-b 8192" option.
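The effect of the Fixed flag can be illustrated with a short Python sketch. It assumes the READ(6) layout for sequential-access devices as I understand it: opcode $08, Fixed in bit 0 of byte 1 and a 24-bit transfer length in bytes 2 to 4, counted in blocks when Fixed is set and in bytes otherwise. `read6_byte_count` is a hypothetical helper, not part of s2p.

```python
def read6_byte_count(cdb_text: str, block_size: int) -> int:
    """Return the number of bytes requested by a tape READ(6) CDB."""
    cdb = bytes.fromhex(cdb_text.replace(":", ""))
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a READ(6) CDB")
    fixed = bool(cdb[1] & 0x01)                    # Fixed flag
    transfer_length = int.from_bytes(cdb[2:5], "big")
    # Fixed: length counts blocks; variable: length counts bytes.
    return transfer_length * block_size if fixed else transfer_length


# Fixed flag set: $14 = 20 blocks of 512 bytes each.
print(read6_byte_count("08:01:00:00:14:00", 512))  # 10240
# Fixed flag clear: the driver asks for $2000 = 8192 bytes directly.
print(read6_byte_count("08:00:00:20:00:00", 512))  # 8192
```

Both CDBs appear in the logs earlier in this ticket. With the Fixed flag clear the configured block size does not enter the calculation, which is why "-b 8192" should not be needed for this driver.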

Regarding log files, as long as the "dir" test does not fail there is no need to attach a log for the "dir" test. From now on, please always add -l "[%l] %v" to your s2p startup options. This log pattern will make the log files more compact and easier to read because it removes the timestamps, which are irrelevant for these tests. You can also set the pattern permanently in /etc/s2p.conf with:

log_pattern=[%l] %v

uweseimet commented 1 week ago

Except for having to be a multiple of 4 bytes (according to the SCSI standard) the tape drive block size is now completely flexible. When there are read/write requests for any block size (this size can be bigger than the default block size), s2p transfers and reads/writes the data in multiples of the default block size. This is reflected in the log: The bigger the default tape block size, the fewer read/write operations are needed. But the final result is always the same. For 8192 byte records in a simh file this means that when the SCTP device block size is 512 bytes, s2p splits the 8192 byte transfer into 16 chunks. When the SCTP device block size is 8192 bytes, a record is transferred in a single chunk, yielding the same effective result. For the user this is fully transparent.
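The arithmetic can be checked with a few lines of Python. This only illustrates the chunking idea and is not the actual s2p implementation; `split_transfer` is a made-up name.

```python
def split_transfer(byte_count: int, block_size: int) -> list[int]:
    """Split a transfer into chunks of at most block_size bytes."""
    chunks = []
    remaining = byte_count
    while remaining > 0:
        chunk = min(block_size, remaining)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


print(len(split_transfer(8192, 512)))   # 16
print(len(split_transfer(8192, 8192)))  # 1
```

In both cases the chunks add up to the same 8192 bytes, only the number of operations differs.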

Pacjunk commented 1 week ago

I cannot compile now. I start the compilation and it goes for probably 90 minutes on the one file then:

g++ -O0 -g -DDEBUG -std=c++20 -iquote . -D_FILE_OFFSET_BITS=64 -DFMT_HEADER_ONLY -DSPDLOG_FMT_EXTERNAL -MD -MP -Wall -Wextra -Wno-psabi  -DBOARD_FULLSPEC -DBUILD_SCHD -DBUILD_SCMO -DBUILD_SCCD -DBUILD_SCTP -DBUILD_SCDP -DBUILD_SCLP -DBUILD_SCHS -DBUILD_SAHD -DBUILD_STORAGE_DEVICE -DBUILD_DISK -c command/command_dispatcher.cpp -o obj/command_dispatcher.o
g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.

It seems to use most of the CPU and memory.

I followed your installation instructions and added: export CXXFLAGS="clang++" This just generated errors as it just adds clang++ as a parameter, so you get "g++ clang++ ..." and of course g++ doesn't know what to do with this.

I tried grabbing one of the build lines and substituted g++ with clang++ and this worked to build that one module.

I don't know enough about makefiles to understand how to fix this, so I just edited the Makefile to change g++ to clang++

I'm yet to see if this works.

Pacjunk commented 1 week ago

Build seems to complete (although it seemed to rebuild nearly everything). I will test later...

Pacjunk commented 1 week ago

Well that went badly - Can't even mount the tape now (can't find the label). Build c826343

Blocksize unset : mount1.txt

Blocksize set to 8k : mount8k.txt

uweseimet commented 1 week ago

[trace] (ID:LUN 3:0) - Read SIMH header with class 0, value $0000000 at position 264
[debug] (ID 3) - MEDIUM ERROR (Sense Key $03), READ ERROR (ASC $11), ASCQ $00

I'll check this.

Regarding clang++ I just updated the Wiki:

make CXX=clang++

Using clang++ instead of g++ compiles faster. I always use clang++ except for building the binary distributions. Note that after changing the compiler running "make clean" is required.

uweseimet commented 1 week ago

Well that went badly - Can't even mount the tape now (can't find the label). Build https://github.com/uweseimet/scsi2pi/commit/c82634332d422c314cca29bcd3ec2f31654caa3b

This should be fixed with the latest issue_100 branch. I could reproduce this problem with the in process test tool and the same .tap file that you are using. For the test I extracted all commands sent by your test and prefixed them with "-c " (for use with the tool) with

sed -n "s/.*CDB \(.*\)$/-c \1/p" LOGFILE.txt
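Where sed is not at hand, the same extraction can be sketched in Python. The regular expression mirrors the sed pattern above; the function name `extract_commands` is hypothetical, and the sample lines are copied from logs earlier in this ticket.

```python
import re

# Same idea as the sed expression: keep what follows "CDB " on each line.
CDB_PATTERN = re.compile(r".*CDB (.*)$")


def extract_commands(log_lines):
    """Turn s2p log lines into "-c <CDB>" arguments for the test tool."""
    commands = []
    for line in log_lines:
        match = CDB_PATTERN.match(line.rstrip("\n"))
        if match:
            commands.append("-c " + match.group(1))
    return commands


sample = [
    "[debug] (ID 3) - Controller is executing SPACE(6), CDB 11:00:00:7f:ff:00",
    "[debug] (ID:LUN 3:0) - Device is executing SPACE(6) ($11)",
    "[debug] (ID 3) - Controller is executing READ(6)/GET MESSAGE(6), CDB 08:00:00:20:00:00",
]
for command in extract_commands(sample):
    print(command)
```

Lines without a CDB are dropped, just like with "sed -n ... p".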

I removed all commands that are irrelevant for this test like INQUIRY or TEST UNIT READY and set up a sequence of commands in a file "simh.tst" with this content:

-i 0 -L trace
-c 08:00:00:00:50:00
-c 08:00:00:00:50:00
-c 08:00:00:00:50:00
-c 08:00:00:00:50:00
-c 11:00:00:7f:ff:00
-c 08:00:00:00:50:00

I run the test on my Linux PC with

./bin/in_process_tool -s "-i 0 simh.tap" < simh.tst

This way I send the commands in the simh.tst file to s2p. This produces essentially the same s2p logfile as your test, and I can compare the results. In the case of the issue you reported I can see that with the fixed issue_100 branch the results appear to be correct again. Let's see whether your system confirms that ;-).

Pacjunk commented 1 week ago

OK, using build 3a27cf7, mount now works again, and I successfully did a directory listing of the tape.

Copying of a file from the tape still fails: copy.txt

If I then do a directory listing of the tape (after the copy), it also fails : dir.txt

Don't know why it works before the copy, but fails afterwards.

uweseimet commented 1 week ago

@Pacjunk OK; thank you. Before getting back to you I will first work on #102, because with this functionality testing will become much easier. You can then simply generate a script file with all commands sent by your driver, and I can re-play these commands on my PC, as if I were using the same hardware/driver as you. At least that's the idea of #102.

uweseimet commented 6 days ago

@Pacjunk I could easily reproduce your issue, because the commands sent by your driver do not vary much. These two s2pexec commands reproduce the bug:

-c 11:00:00:7f:ff:00
-c 08:00:00:20:00:00

Translation: Skip all filemarks and then read the first data record.

The latest issue_100 branch fixes the bug. Compiling will take longer again this time because some central header files have changed.

issue_100 now also contains the changes for #102, which will make testing much easier. In order to profit from these changes, this is what I would like you to do:

  1. Update your log pattern (either on the command line or in /etc/s2p.conf) to "[%^%l%$] %v".
  2. During testing always have s2p create a script for s2pexec, with all commands and data received. The new -s option will do that (please read #102 for details):
    s2p -s /tmp/script.txt ...

    Like with any command line parameter, you can add a property to /etc/s2p.conf instead:

    script_file=/tmp/script.txt
  3. Whenever something goes wrong, please attach the logfile and the matching script file. When I execute this script with the same image file as you, I should run into exactly the same problem as you. If this works as expected it will be a big help with testing.

Pacjunk commented 6 days ago

I get a failure when trying to compile...

clang++ -O0 -g -DDEBUG -std=c++20 -iquote . -D_FILE_OFFSET_BITS=64 -DFMT_HEADER_ONLY -DSPDLOG_FMT_EXTERNAL -MD -MP -Wall -Wextra -Wno-psabi  -DBOARD_FULLSPEC -DBUILD_SCHD -DBUILD_SCMO -DBUILD_SCCD -DBUILD_SCTP -DBUILD_SCDP -DBUILD_SCLP -DBUILD_SCHS -DBUILD_SAHD -DBUILD_STORAGE_DEVICE -DBUILD_DISK  -o bin/s2p obj/s2p_core.o obj/s2p_parser.o obj/s2p_thread.o obj/s2p.o lib/libcommand.a lib/libbus.a lib/libcontroller.a lib/libdevice.a lib/libshared.a  -lpthread -lprotobuf
clang: error: unable to execute command: Killed
clang: error: linker command failed due to signal (use -v to see invocation)
make: *** [Makefile:390: bin/s2p] Error 254

uweseimet commented 6 days ago

@Pacjunk This appears to be an issue with not enough memory for linking. It is not the compile but the link stage. Please try g++ instead of clang++, or try an alternative linker as explained on https://github.com/uweseimet/scsi2pi/wiki/Compilation-Instructions. Or, at least temporarily, edit the Makefile, search for "-g" and remove it. Note that in any case except for an alternative linker you need to run "make clean" before trying again. An alternative linker might need even more memory, though.

Pacjunk commented 6 days ago

I changed to clang because g++ was using too much memory. I'm looking at configuring a spare Pi 3B to do the job. I was going to use the 64 bit pre-built image for PiSCSI, then loading SCSI2Pi over the top. Is this the best option, or should I use 32bit, or another flavour of raspbian?

uweseimet commented 6 days ago

@Pacjunk When omitting the -g option I assume it will work, i.e. you should give this a try. Whether 32 bit needs less memory I cannot tell. If you have any other platform to compile the right (binary format) binaries with, this also is an option. A cross-compiler is the fastest option, if you have one. Some x86/x86_64 Linux distributions offer cross compilers for the arm architecture. Personally, I either cross-compile or compile on a Pi 4. But before a release I also ensure that a Pi Zero still works for an optimized build. (But only for releases, not during development.) But the optimized builds do not contain assertions, which is a problem during the development phase. At the current phase of testing I think we can do without assertions, i.e. an optimized build should also be fine. It will produce smaller object files, which will reduce the memory needed for linking.

Edit: In case you set up a new Pi, I recommend using bookworm. The support of bullseye will end next year. If you use a PiSCSI base image you may have to stick to bullseye, though. Also note https://github.com/PiSCSI/piscsi/issues/1481 in this case.

Pacjunk commented 6 days ago

Removing -g didn't help. Currently rebuilding without debug. I suppose 32 or 64 bit doesn't matter too much. The 3B only has 1GB memory.

uweseimet commented 6 days ago

1 GB should be enough. The build failed on a 512 MB Pi, didn't it? What should also work on a 512 MB Pi is adding some swap space, e.g. with a swap file of about 512 MB.

Pacjunk commented 6 days ago

Yep 512. ZeroW. Reluctant to add swap - shortens life of SD card! (and makes things even slower)

uweseimet commented 6 days ago

Yes, swap space is never that nice, but sometimes a last resort.

uweseimet commented 5 days ago

@Pacjunk When resuming your tests please make sure to update to the latest available issue_100 branch. It contains fixes for exotic functions (but your driver might use them, you never know). The logging of the tape position has also been improved, which is important because many of the issues you reported were caused by s2p being out-of-sync with the file contents.

Pacjunk commented 5 days ago

Compiled OK without debug. Build e31d416

Problem is it's broken again. Cannot mount the tape...

mount.txt script.txt

uweseimet commented 5 days ago

@Pacjunk Has there been any change with your setup? When I compare the log of the successful mount with the latest log I find a READ POSITION command in the new log at line 155, whereas in the old log in the corresponding line 157 there is a MODE SENSE. Up to this line all data transferred are identical. This IMO means that nothing has effectively changed as far as the behavior of s2p is concerned, but nevertheless your driver sends different commands.

Pacjunk commented 5 days ago

No doing exactly the same. Only thing different is compiling without debug. I can run the test again if you like.

uweseimet commented 5 days ago

Debug should not matter, but if nothing else has changed this would have to be verified. Even though it takes some time I would like to ask you to run the mount test again with commit ID 3a27cf7 and to attach the log. I recommend doing this in a separate folder, e.g. like this:

cp -r -p scsi2pi.git scsi2pi.git.test
cd scsi2pi.git.test
make clean
git checkout 3a27cf7

Then run make with the same settings you used last time for your current build. Now we see why it is so important to always know the commit ID ;-).

Note that the latest s2p warns about unknown parameters, and you are still using "caching_mode=write-through:3" as a global parameter, which is invalid. I suggest you remove this parameter. Caching is only supported for hard drives, and in this case the syntax is "device.3.caching_mode=". See https://www.scsi2pi.net/en/properties.html for the syntax.

What's also strange is that the logs start to differ very early, even before any data blocks or filemarks have been dealt with. There has not yet been any transfer of data from the image file.

The only commands sent by your driver before the logs differ are TEST UNIT READY, INQUIRY and MODE SENSE. None of these commands or their responses has changed, so I really wonder what causes a READ POSITION to be sent this time.

Pacjunk commented 5 days ago

Just ran the test again. Logs attached: mount.txt script.txt

uweseimet commented 5 days ago

So no change, i.e. my previous comment still applies: this needs to be verified with commit ID 3a27cf7, because the logs say that s2p does the same as before.

Pacjunk commented 5 days ago

If I do git checkout 3a27cf7 it says that I'm in headless mode. Shouldn't it replace some files?

uweseimet commented 5 days ago

It has. You are not at the HEAD of the source tree anymore, that's what the message means. You can check the current commit ID with "git log --abbrev-commit".

Pacjunk commented 5 days ago

OK, just used to git pull displaying changed files. Thought it might display something similar. Apparently not. Building now - if I'm lucky it might finish before midnight!

uweseimet commented 5 days ago

Just take your time. Just to ensure that we are in sync: The idea is to verify that it is reproducible that your driver interacts differently with the two commits. The logs say that the behavior of s2p has not changed, and they contain the complete flow of data. So what has actually changed, or what am I missing that's not in the logs? It is good to have logs without timestamps now, so that they are directly comparable with tools like diff.
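For older logs that still carry timestamps, the comparison can be sketched in a few lines of Python. This is only an illustration, not part of s2p: the function names are made up, and the timestamp pattern is assumed from the log lines quoted earlier in this ticket. It reports the first line where two logs differ once the timestamps are stripped. It compares line by line, so it does not realign the logs after an inserted or missing line the way diff does.

```python
import re
from itertools import zip_longest

# Assumed timestamp prefix, e.g. "[2024-11-11 22:28:12.431] "
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]\s*")


def strip_timestamp(line):
    """Remove the trailing newline and a leading timestamp, if present."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def first_difference(old_lines, new_lines):
    """Return (line_number, old_line, new_line) for the first mismatch.

    Line numbers are 1-based. A log that ends early yields None for its
    side of the tuple. Returns None if both logs are identical.
    """
    pairs = zip_longest(old_lines, new_lines)
    for number, (old, new) in enumerate(pairs, start=1):
        old_clean = None if old is None else strip_timestamp(old)
        new_clean = None if new is None else strip_timestamp(new)
        if old_clean != new_clean:
            return number, old_clean, new_clean
    return None
```

The function accepts any iterable of lines, so two open file objects (for example the old and the new mount.txt) can be passed in directly.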

Pacjunk commented 4 days ago

OK, rebuilt with 3a27cf7 and it is back to the previous behaviour. Mount works, directory works, file copy fails, subsequent directory fails. mount.txt mount script.txt copy.txt copy script.txt

uweseimet commented 4 days ago

Thank you for re-compiling.

At least the results are consistent. I have no idea yet why your driver uses different commands even though the logfiles do not indicate any difference, up to the point where your driver suddenly uses READ POSITION. I will get back to you as soon as I have figured out what might be happening.

By the way, the old code (ID 3a27cf7) might behave better when you use the "-b 8192" startup option. Can you please give it a try? Please do not run the "copy" test before the "dir" test succeeds. The simplest tests should be executed first, and if they fail the more complicated ones do not need to be executed.

uweseimet commented 4 days ago

There is progress again, but let me ask a question first: Can it be that your driver remembers something about the state of a drive? Does it "know" that after re-starting s2p the tape is effectively a different one, just as if you had switched off and on a real tape drive? If the driver remembers data (e.g. positions it has previously read from the tape), this information is out of sync in such a case, which might explain a different behavior, depending on how the tests are run. Unmounting the tape should fix that.

Anyway, I found that the additional commands sent by the driver were not causing the mount test to fail. There was a read error later, which caused the mount operation to fail. Using your scripts and logfiles I could reproduce the bug, and with the latest issue_100 branch it is fixed, at least as far as I can test it with my setup. There are also more unit tests now dealing with changing positions on a drive. There is a lot of navigation over SIMH objects going on, and it is not so easy to get everything right, like skipping markers or blocks, also for the reverse navigation cases. I also have to ensure that if somebody digs out another tape image file format SCSI2Pi is prepared to support it, i.e. that the code does not become too SIMH-specific.
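To illustrate the kind of navigation involved, here is a minimal Python sketch of spacing forward and backward over objects in a SIMH image. It is not the s2p code (which is C++), and all names are hypothetical. It assumes the basic layout from the SIMH magtape specification: a 4-byte little-endian length, the data padded to an even byte count, and a trailing copy of the length, with 0x00000000 as a tape mark and 0xFFFFFFFF as end of medium. As a simplification it assumes the low 28 bits hold the record length and ignores the class/error bits in the high nibble.

```python
import struct

TAPE_MARK = 0x00000000
END_OF_MEDIUM = 0xFFFFFFFF
LENGTH_MASK = 0x0FFFFFFF  # assumption: low 28 bits are the record length


def _padded(length):
    """Record data is padded to an even number of bytes."""
    return length + (length & 1)


def space_forward(image, pos):
    """Skip the object that starts at pos. Return (kind, new_pos)."""
    if pos + 4 > len(image):
        return "end_of_data", pos
    (header,) = struct.unpack_from("<I", image, pos)
    if header == TAPE_MARK:
        return "filemark", pos + 4
    if header == END_OF_MEDIUM:
        return "end_of_medium", pos
    length = header & LENGTH_MASK
    # leading length + padded data + trailing length
    return "block", pos + 4 + _padded(length) + 4


def space_reverse(image, pos):
    """Skip the object that ends at pos. Return (kind, new_pos)."""
    if pos < 4:
        return "beginning_of_medium", 0
    # The 4 bytes before pos are either a tape mark or a record trailer.
    (trailer,) = struct.unpack_from("<I", image, pos - 4)
    if trailer == TAPE_MARK:
        return "filemark", pos - 4
    length = trailer & LENGTH_MASK
    return "block", pos - 4 - _padded(length) - 4
```

The trailing length copy is what makes the reverse case possible without scanning from the start of the file. Keeping the format knowledge inside two such functions is one way to avoid the rest of the code becoming SIMH-specific.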

I am curious what the next tests will reveal.

Pacjunk commented 4 days ago

229145b Still no go. Note that the mount is with a clean boot of the OS - no prior operations on that device.

mount.txt mount script.txt

uweseimet commented 4 days ago

@Pacjunk There is still a MEDIUM ERROR being reported in line 780. I can reproduce this with your script.