PiSCSI / piscsi

PiSCSI allows a Raspberry Pi to function as emulated SCSI devices (hard disk, CD-ROM, and others) for vintage SCSI-based computers and devices. This is a fork of the RaSCSI project by GIMONS.
https://piscsi.org
BSD 3-Clause "New" or "Revised" License
537 stars 82 forks source link

When ID1 is specified at startup, RaSCSI fails with duplicate ID #101

Closed akuker closed 3 years ago

akuker commented 3 years ago

Info

Describe the issue

If I specify -ID1 in my service's ExecStart, the service fails to load, claiming a duplicate ID (it isn't). All other IDs appear to work. position doesn't appear to matter. Manually attaching a device to ID 1 works fine when done through rasctl.

jeremybernstein commented 3 years ago

Clarification: when ID1 and ID2 are both specified at startup, RaSCSI fails with duplicate ID

I'll post some working/non-working examples shortly.

jeremybernstein commented 3 years ago

ExecStart=/usr/local/bin/rascsi -ID1 /home/pi/images/600MB.hda -ID2 /home/pi/images/EIII_Factory_Sounds_Vol_1.iso doesn't work.

ExecStart=/usr/local/bin/rascsi -ID1 /home/pi/images/600MB.hda -ID3 /home/pi/images/EIII_Factory_Sounds_Vol_1.iso does.

BotchedRPR commented 3 years ago

See my comment in troubleshooting.

ID0 + ID1 = working ID1 + ID2 = broken So it's not the cause of ID1, it's more likely ID2! (correct me if I'm wrong)

jeremybernstein commented 3 years ago

No, because -ID0 + -ID2 works.

phrax0 commented 3 years ago

Interesting failure pattern...

first ID    second ID   result
0       0       last one wins
0       1       good
0       2       good
0       3       good
0       4       good
0       5       good
0       6       good
1       0       good
1       1       last one wins
1       2       2: duplicate ID
1       3       good
1       4       good
1       5       good
1       6       good
2       0       good
2       1       good
2       2       last one wins
2       3       good
2       4       4: duplicate ID
2       5       good
2       6       good
3       0       good
3       1       good
3       2       good
3       3       last one wins
3       4       good
3       5       good
3       6       6: duplicate ID
4       0       good
4       1       good
4       2       good
4       3       good
4       4       last one wins
4       5       good
4       6       good
5       0       good
5       1       good
5       2       good
5       3       good
5       4       good
5       5       last one wins
5       6       good
6       0       good
6       1       good
6       2       good
6       3       good
6       4       good
6       5       good
6       6       last one wins

If you specify ID1 and ID2, it fails. But putting ID2 then ID1, it works. Same as 2, 4 fails, but 4, 2 works. And 3, 6 fails, but 6, 3 works.

phrax0 commented 3 years ago

This appears to be a problem in rascsi.cpp around line 923.

if (id < 0) {
        fprintf(stderr, "%s: ID not specified\n", optarg);
        return false;
} else if (disk[id] && !disk[id]->IsNULL()) {
        fprintf(stderr, "%d: duplicate ID\n", id);
        return false;
}

when using IDs 1+2, or 2+4, or 3+6, disk[id] gets populated after the first ID is parsed. Adding some logging just before that section reveals...

pi@raspberrypi:~/RASCSI/src/raspberrypi $ sudo  bin/fullspec/rascsi -ID1 ./harddrive.hda -ID2 ./cdrom.iso
SCSI Target Emulator RaSCSI(*^..^*) version 21.99 --DEVELOPMENT BUILD-- (May 31 2021, 00:18:02)
Powered by XM6 TypeG Technology / Copyright (C) 2016-2020 GIMONS
Connect type : FULLSPEC

id is [1] - disk[id] is [0] - optarg is [./harddrive.hda]
[2021-05-31 00:18:35.525] [info] rasctl added new SCHD device. ID: 1 UN: 0

id is [2] - disk[id] is [0x161df50] - optarg is [./cdrom.iso]
2: duplicate ID

I haven't figured out what is setting disk[2] to something. That value is different for every execution, so it's possibly a pointer?

phrax0 commented 3 years ago

commenting out this section allows the image to be mounted.

if (id < 0) {
        fprintf(stderr, "%s: ID not specified\n", optarg);
        return false;
//} else if (disk[id] && !disk[id]->IsNULL()) {
//        fprintf(stderr, "%d: duplicate ID\n", id);
//        return false;
}

however, with the code in place, trying to mount the same ID doesn't fail as we might expect it to. I'm not exactly sure what this code block is supposed to do.

pi@raspberrypi:~/RASCSI/src/raspberrypi $ sudo  bin/fullspec/rascsi -ID1 ./harddrive.hda -ID1 ./cdrom.iso
SCSI Target Emulator RaSCSI(*^..^*) version 21.99 --DEVELOPMENT BUILD-- (May 31 2021, 00:19:29)
Powered by XM6 TypeG Technology / Copyright (C) 2016-2020 GIMONS
Connect type : FULLSPEC

id is [1] - disk[id] is [0] - optarg is [./harddrive.hda]
[2021-05-31 00:28:32.690] [info] rasctl added new SCHD device. ID: 1 UN: 0

id is [1] - disk[id] is [0] - optarg is [./cdrom.iso]
[2021-05-31 00:28:32.690] [info] rasctl added new SCCD device. ID: 1 UN: 0

[2021-05-31 00:28:32.691] [info] +----+----+------+-------------------------------------
[2021-05-31 00:28:32.691] [info] | ID | UN | TYPE | DEVICE STATUS
[2021-05-31 00:28:32.691] [info] |  1 |  0 | SCCD | ./cdrom.iso(WRITEPROTECT)
[2021-05-31 00:28:32.691] [info] +----+----+------+-------------------------------------
jeremybernstein commented 3 years ago

Thanks for investigating. I also looked at the code when reporting the bug and couldn't see where (id 2) was getting assigned a value. It's a bit of a mystery, although your findings might help. In fact... check out what's happening in MapControler() [sic] -- this looks suspicious and could lead to an (id2) being inadvertently populated (since UnitNum is #defined as 2). Maybe I can isolate the cmd parsing and build this for debugging on macOS and find the issue quicker, and without needing to guess. :-)

    // Replace the changed unit
    for (i = 0; i < CtrlMax; i++) {
        for (j = 0; j < UnitNum; j++) {
            unitno = i * UnitNum + j;
            if (disk[unitno] != map[unitno]) {
                // Check if the original unit exists
                if (disk[unitno]) {
                    // Disconnect it from the controller
                    if (ctrl[i]) {
                        ctrl[i]->SetUnit(j, NULL);
                    }

                    // Free the Unit
                    delete disk[unitno];
                }

                // Setup a new unit
                disk[unitno] = map[unitno];
            }
        }
    }
phrax0 commented 3 years ago
if (id < 0) {
        fprintf(stderr, "%s: ID not specified\n", optarg);
        return false;
//} else if (disk[id] && !disk[id]->IsNULL()) {
//        fprintf(stderr, "%d: duplicate ID\n", id);
//        return false;
}

I mucked around with this a bit more. That code block above also doesn't prevent you from attaching an image via rasctl to an ID that has something assigned already. I honestly don't know what it's supposed to do.

@jeremybernstein going to dig into your suggestion now.

jeremybernstein commented 3 years ago

I haven't had a chance to investigate this, but the MapControler() code appears to be written with the assumption of disk pairs at disk[N] and disk[N+1], whereas the parsing code is using disk[N] with a simple consecutive index. One is presumably incorrect.

phrax0 commented 3 years ago

@jeremybernstein I think you're on the right track. There is something related to how unitno is incremented. I think it's related to the SCSI/SASI controller stuff. Sorry for the vague explanation, but here is the output with some debug statements included...

pi@raspberrypi:~/RASCSI/src/raspberrypi $ sudo  bin/fullspec/rascsi -ID1 ./cdrom.iso -ID3 ./foo.hda
SCSI Target Emulator RaSCSI(*^..^*) version 21.99 --DEVELOPMENT BUILD-- (May 31 2021, 23:03:23)
Powered by XM6 TypeG Technology / Copyright (C) 2016-2020 GIMONS
Connect type : FULLSPEC
# we're in MapControler now
i is [0] UnitNum is [2] j is [0]
unitno = 0 * 2 + 0 = [0]
i is [0] UnitNum is [2] j is [1]
unitno = 0 * 2 + 1 = [1]
i is [1] UnitNum is [2] j is [0]
unitno = 1 * 2 + 0 = [2]
# Setup a new unit - unitno is [2] and disk[2] contains [0]
i is [1] UnitNum is [2] j is [1]
unitno = 1 * 2 + 1 = [3]
i is [2] UnitNum is [2] j is [0]
unitno = 2 * 2 + 0 = [4]
i is [2] UnitNum is [2] j is [1]
unitno = 2 * 2 + 1 = [5]
i is [3] UnitNum is [2] j is [0]
unitno = 3 * 2 + 0 = [6]
i is [3] UnitNum is [2] j is [1]
unitno = 3 * 2 + 1 = [7]
i is [4] UnitNum is [2] j is [0]
unitno = 4 * 2 + 0 = [8]
i is [4] UnitNum is [2] j is [1]
unitno = 4 * 2 + 1 = [9]
i is [5] UnitNum is [2] j is [0]
unitno = 5 * 2 + 0 = [10]
i is [5] UnitNum is [2] j is [1]
unitno = 5 * 2 + 1 = [11]
i is [6] UnitNum is [2] j is [0]
unitno = 6 * 2 + 0 = [12]
i is [6] UnitNum is [2] j is [1]
unitno = 6 * 2 + 1 = [13]
i is [7] UnitNum is [2] j is [0]
unitno = 7 * 2 + 0 = [14]
i is [7] UnitNum is [2] j is [1]
unitno = 7 * 2 + 1 = [15]
[2021-05-31 23:04:03.583] [info] rasctl added new SCCD device. ID: 1 UN: 0
# we're in MapControler now
i is [0] UnitNum is [2] j is [0]
unitno = 0 * 2 + 0 = [0]
i is [0] UnitNum is [2] j is [1]
unitno = 0 * 2 + 1 = [1]
i is [1] UnitNum is [2] j is [0]
unitno = 1 * 2 + 0 = [2]
i is [1] UnitNum is [2] j is [1]
unitno = 1 * 2 + 1 = [3]
i is [2] UnitNum is [2] j is [0]
unitno = 2 * 2 + 0 = [4]
i is [2] UnitNum is [2] j is [1]
unitno = 2 * 2 + 1 = [5]
i is [3] UnitNum is [2] j is [0]
unitno = 3 * 2 + 0 = [6]
# Setup a new unit - unitno is [6] and disk[6] contains [0]
i is [3] UnitNum is [2] j is [1]
unitno = 3 * 2 + 1 = [7]
i is [4] UnitNum is [2] j is [0]
unitno = 4 * 2 + 0 = [8]
i is [4] UnitNum is [2] j is [1]
unitno = 4 * 2 + 1 = [9]
i is [5] UnitNum is [2] j is [0]
unitno = 5 * 2 + 0 = [10]
i is [5] UnitNum is [2] j is [1]
unitno = 5 * 2 + 1 = [11]
i is [6] UnitNum is [2] j is [0]
unitno = 6 * 2 + 0 = [12]
i is [6] UnitNum is [2] j is [1]
unitno = 6 * 2 + 1 = [13]
i is [7] UnitNum is [2] j is [0]
unitno = 7 * 2 + 0 = [14]
i is [7] UnitNum is [2] j is [1]
unitno = 7 * 2 + 1 = [15]
[2021-05-31 23:04:03.588] [info] rasctl added new SCHD device. ID: 3 UN: 0

+----+----+------+-------------------------------------
| ID | UN | TYPE | DEVICE STATUS
+----+----+------+-------------------------------------
|  1 |  0 | SCCD | ./cdrom.iso(WRITEPROTECT)
|  3 |  0 | SCHD | ./foo.hda
+----+----+------+-------------------------------------

When the function is called for ID 1, unitno is set to 2 for some reason, and disk[2] gets some data in it. When the function runs for ID 3, unitno is set to 6, and presumably disk[6] gets some data in it.

This explains why mapping 1 + 2 gives an error. I haven't pinpointed what is causing this. I'm thinking

Hmmm. UnitNum (Number of units around controller) is set to 2 at the beginning. In the loop, i counts up to UnitNum every time, and it's multiplied into the unitno value. I just dropped UnitNum from 2 to 1, compiled, and so far I am able to mount ID1 and ID2 at the same time.

I'm going to do some more testing, but dropping UnitNum to 1 might be the solution.

phrax0 commented 3 years ago

more testing

pi@raspberrypi:~/RASCSI/src/raspberrypi $ sudo  bin/fullspec/rascsi -ID1 ./cdrom.iso -ID2 ./foo.hda -ID3 ./cdrom2.iso -ID4 ./harddrive.hda -ID5 ./cdrom3.iso -ID6 ./cdrom4.iso 
SCSI Target Emulator RaSCSI(*^..^*) version 21.99 --DEVELOPMENT BUILD-- (May 31 2021, 23:13:48)
Powered by XM6 TypeG Technology / Copyright (C) 2016-2020 GIMONS
Connect type : FULLSPEC
# Setup a new unit - unitno is [1] and disk[1] contains [0]
[2021-05-31 23:14:21.656] [info] rasctl added new SCCD device. ID: 1 UN: 0
# Setup a new unit - unitno is [2] and disk[2] contains [0]
[2021-05-31 23:14:21.656] [info] rasctl added new SCHD device. ID: 2 UN: 0
# Setup a new unit - unitno is [3] and disk[3] contains [0]
[2021-05-31 23:14:21.656] [info] rasctl added new SCCD device. ID: 3 UN: 0
# Setup a new unit - unitno is [4] and disk[4] contains [0]
[2021-05-31 23:14:21.656] [info] rasctl added new SCHD device. ID: 4 UN: 0
# Setup a new unit - unitno is [5] and disk[5] contains [0]
[2021-05-31 23:14:21.657] [info] rasctl added new SCCD device. ID: 5 UN: 0
# Setup a new unit - unitno is [6] and disk[6] contains [0]
[2021-05-31 23:14:21.657] [info] rasctl added new SCCD device. ID: 6 UN: 0

+----+----+------+-------------------------------------
| ID | UN | TYPE | DEVICE STATUS
+----+----+------+-------------------------------------
|  1 |  0 | SCCD | ./cdrom.iso(WRITEPROTECT)
|  2 |  0 | SCHD | ./foo.hda
|  3 |  0 | SCCD | ./cdrom2.iso(WRITEPROTECT)
|  4 |  0 | SCHD | ./harddrive.hda
|  5 |  0 | SCCD | ./cdrom3.iso(WRITEPROTECT)
|  6 |  0 | SCCD | ./cdrom4.iso(WRITEPROTECT)
+----+----+------+-------------------------------------
jeremybernstein commented 3 years ago

Yes UnitNum is #defined as 2. Changing it to 1 "solves" the problem because the consecutive indexing of the parsing corresponds to the (index*UnitNum+curUnit) calculation in MakeControler(). Changing the UnitNum is only appropriate if RaSCSI doesn't support UNs other than 0. So this is maybe a question for @akuker : are LUNs > 0 supported? If not, @phrax0 's fix is probably "good enough", although a more maintainable solution would be to additionally fix the parsing code to take UnitNum into account (even if only UN 0 is ever used for the time being). That way, if LUNs are supported later on, there won't be new surprises.

phrax0 commented 3 years ago

@jeremybernstein agreed. @akuker looking for your comments.

rdmark commented 3 years ago

I tested the known buggy ID combinations, plus a handful of known good ones, in develop 8a3642b. This is does not reproduce for me.

@akuker @jeremybernstein @phrax0 This part of the code has changed significantly since this issue was raised. Would you mind testing again in your environments?

uweseimet commented 3 years ago

@rdmark I just stumbled upon this issue, and I can also not reproduce this anymore. But I know that it was not working with the old codebase. Now it is:

>rascsi -ID1 /home/us/hatari/test.hds -ID2 /home/us/hatari/aranym.hds
+----+----+------+-------------------------------------
| ID | UN | TYPE | DEVICE STATUS
+----+----+------+-------------------------------------
|  1 |  0 | SCHD | /home/us/hatari/test.hds (WRITEPROTECT)
|  2 |  0 | SCHD | /home/us/hatari/aranym.hds
+----+----+------+-------------------------------------

It's disappointing that nobody from the list above re-tested this ...

uweseimet commented 3 years ago

@rdmark I suggest to close this issue, because everybody else seems to have lost interest, and it works for our setups.

rdmark commented 3 years ago

Agreed. This part of the code has changed enough that the original analysis is no longer applicable. Please raise a new ticket if there are still similar issues in certain environments!