XPEnology-Community / redpill-lkm5

GNU General Public License v3.0

@jim3ma, Can GeminiLake, v1000, and r1000 be used as HBAs in lkm4? #14

Open PeterSuh-Q3 opened 1 year ago

PeterSuh-Q3 commented 1 year ago

https://github.com/XPEnology-Community/redpill-lkm5/commit/d7e0766776d4a9a3d62768f493d9d00581546635

@jim3ma I tried applying this new module you added for lkm5 to lkm4 yesterday.

https://xpenology.com/forum/topic/65408-automated-redpill-loader-arpl/?do=findComment&comment=449219

This resolved an issue where the disk serial number, which is part of the S.M.A.R.T. information, could not be displayed when using an HBA on DS918+ (Apollo Lake). Thank you so much for adding such a great feature.

I have one question, or rather an additional request. The SA6400 is a Device-Tree based model, so can an HBA use this feature there as well? I haven't been able to test the SA6400 yet because I was concentrating on the DS918+ with an HBA yesterday. If it is possible, I hope HBA support can also be enabled in lkm4 for the existing Device-Tree based platforms such as Gemini Lake, v1000, and r1000. Is that possible?

jim3ma commented 1 year ago

I have compiled many HBA modules for SA6400. I'm sure SA6400 with DSM 7.2 works well with many HBA cards, like LSI and HBA 1000. I think other platforms are the same.

PeterSuh-Q3 commented 1 year ago

@jim3ma

What type of HBA did you test? I am testing on a 4th-generation Intel system with a Dell Perc H200 or H310, which corresponds to SAS2008 and uses the mpt3sas module.

Direct Boot was activated on ARPL-i18n. The disks add-on could not be added directly. However, ARPL-i18n seems to handle the disk mapping internally through the disks add-on. (I checked the script, and it already adds the add-on internally.) https://github.com/wjz304/arpl-i18n/blob/main/files/board/arpl/overlayfs/opt/arpl/ramdisk-patch.sh#L165

I checked the serial port log below. The device is listed under the HBA but is not mapped in model.dtb. Do you think that storage under HBA in DT-based systems such as SA6400 should also be mapped to model.dtb?

serial_port_log.txt

(Screenshot: 2023-09-03, 9:56:10 AM)

jim3ma commented 1 year ago

Do you think that storage under HBA in DT-based systems such as SA6400 should also be mapped to model.dtb?

Yes, I have modified HBA drivers to match sd_probe in SA6400.

The kernel source between SA6400 and other machines is not different, but I think an HBA will work on those machines if the HBA drivers are updated to match the Synology kernel.

PeterSuh-Q3 commented 1 year ago

So, if I understand correctly, the HBA operates independently of model.dtb.

And, it seems that kernel 4 already has a part related to sd_probe.

https://github.com/search?q=repo%3APeterSuh-Q3%2Farpl-modules%20sd_probe&type=code

This repo also contains kernel 5 sources, but sd_probe does not appear in them. Could you please tell me which repo holds your sources? I would like to refer to it.

jim3ma commented 1 year ago

The HBA driver does not touch model.dtb; only the disks add-on does that work, after all disks have been recognized.

sd_probe in drivers/scsi/sd.c allocates the disk name and detects the disk type for all disks, including HBA disks.

PeterSuh-Q3 commented 1 year ago

I wanted to refer to the code in your repo, but searching with the keyword below returned nothing: ":= sd.o"

(Screenshot: 2023-09-04, 9:09:39 AM)

Am I right that you are only giving me guidance here? Did you make any special changes to the disks add-on for the SA6400? It's the same as ARC's code, right?

I am a beginner who has compiled modules on Ubuntu a few times. I don't know if it's possible, but I'll give it a try following your instructions. Looking at the Linux 4.4.302 kernel source, I think sd.c / sd.h / sd_dif.c need to be newly added. I am not sure about sd_dif.c.

I may also get help from ChatGPT to resolve compilation errors. I would also appreciate help from this repo's collaborators. Let's give it a try.

PeterSuh-Q3 commented 1 year ago

There was no problem compiling with the new sd.c / sd.h / sd_dif.c added, and sd_mod.ko was added as a result. Is this how it should be?

(Screenshot: 2023-09-04, 1:37:13 PM)

jim3ma commented 1 year ago

You should compile the HBA drivers to meet the requirements of sd_probe (drivers/scsi/sd.c): add syno_disk_type to the HBA drivers. You can find syno_disk_type in the Synology NAS GPL Source: https://archive.synology.com/download/ToolChain/Synology%20NAS%20GPL%20Source

PeterSuh-Q3 commented 1 year ago

As a model, I am referring to the repo below, which holds source for kernel 4.4.x from 4 years ago.

I think it would be difficult to selectively refer to only the syno_disk_type you mentioned. It is very difficult to get all the relevant parts without missing anything. Are there any good tips?

https://github.com/wellfrogliu/Synology-MT7601u/tree/90ee0df49755fac621e2dc967adcf9fea189bfe2/kernel/linux-4.4.x/drivers/scsi

I imported the entire scsi directory and commented out some functions that cause compilation errors. Perhaps the .ko module will be compiled only from the parts that are defined.

jim3ma commented 1 year ago

https://global.synologydownload.com/download/ToolChain/Synology%20NAS%20GPL%20Source/7.1.1-42962/purley/linux-4.4.x.txz Download this kernel code, and search for syno_port_type in drivers/scsi/mpt3sas.

PeterSuh-Q3 commented 1 year ago

Thank you. Referring to the source you provided and to the syno_port_type usage that is already in my code, I will add it to drivers/scsi/mpt3sas.

https://github.com/PeterSuh-Q3/arpl-modules/blob/main/src/4.x/drivers/scsi/virtio_scsi.c#L811

I'm testing Gemini Lake. I have already completed the first compilation using the guide you provided yesterday. I don't know if it will work.

And, I found a declaration that is already in use in kernel 5.

https://github.com/PeterSuh-Q3/arpl-modules/blob/main/src/5.x/drivers/scsi/mpt3sas/mpt3sas_scsih.c#L10715

I think the same thing can be applied to kernel 4 code. Is that right?

https://github.com/PeterSuh-Q3/arpl-modules/commit/d910009965b493cd9be93dddff8a31fe39f3967c

jim3ma commented 1 year ago

Different platforms use different mechanisms; for example, SA6400 only supports SYNO_PORT_TYPE_SATA. You must verify this yourself.

PeterSuh-Q3 commented 1 year ago

The code under /drivers/scsi has been modified according to your instructions, as shown below.

(Screenshot: 2023-09-06, 10:25:24 AM)

(Screenshot: 2023-09-06, 10:26:08 AM)

(Screenshot: 2023-09-06, 10:26:57 AM)

(Screenshot: 2023-09-06, 4:05:29 PM)

The platform I am currently testing is Gemini Lake DS920+. For the toolchain code, refer to the link below.

https://global.synologydownload.com/download/ToolChain/Synology%20NAS%20GPL%20Source/7.1.1-42962/geminilake/linux-4.4.x.txz

I am currently testing one Intel SSD on a Dell Perc H200 in IT mode. I used the disks add-on from TCRP FRIEND as per your instructions, and the log results are as follows. (It looks like you used direct boot with ARPL. TCRP can switch to that too, but for now I used the FRIEND kernel, which goes through the GNU kernel.)

messages.txt

linuxrc.syno.log

The disk does not appear in /sys/block. What's the problem?

SynologyNAS> ll /sys/block
drwxr-xr-x 2 root root 0 Sep 6 06:55 .
dr-xr-xr-x 12 root root 0 Sep 6 06:55 ..
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram0 -> ../devices/virtual/block/ram0
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram1 -> ../devices/virtual/block/ram1
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram10 -> ../devices/virtual/block/ram10
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram11 -> ../devices/virtual/block/ram11
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram12 -> ../devices/virtual/block/ram12
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram13 -> ../devices/virtual/block/ram13
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram14 -> ../devices/virtual/block/ram14
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram15 -> ../devices/virtual/block/ram15
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram2 -> ../devices/virtual/block/ram2
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram3 -> ../devices/virtual/block/ram3
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram4 -> ../devices/virtual/block/ram4
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram5 -> ../devices/virtual/block/ram5
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram6 -> ../devices/virtual/block/ram6
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram7 -> ../devices/virtual/block/ram7
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram8 -> ../devices/virtual/block/ram8
lrwxrwxrwx 1 root root 0 Sep 6 06:55 ram9 -> ../devices/virtual/block/ram9

jim3ma commented 1 year ago

Your messages.txt is missing some logs. The current messages.txt does show one error: a general protection fault related to mpt3sas.

PeterSuh-Q3 commented 1 year ago

In case there was any improvement, I replaced the /drivers/scsi/mpt3sas subdirectory with geminilake's original old version 09.102.00.00 and compiled it. The results seem to be the same. A general protection fault occurs.

[ 50.618684] mpt3sas version 09.102.00.00 loaded
[ 51.023252] general protection fault: 0000 [#1] SMP

I did some Googling, and this problem seems very tricky to analyze. There are many questions about it on Red Hat and similar sites, but it is hard to find answers that include an analysis. I also don't have the ability to analyze the debugging logs shown here and tell whether there is a memory problem.

Should we stop here?

I am attaching the full dmesg log again.

dmesg.txt

jim3ma commented 1 year ago

The most likely cause is a difference in struct memory layout due to different CONFIG_* options.

sd_probe uses a pointer supplied by the mpt3sas module. If the struct layout is not the same on both sides, sd_probe accesses an invalid memory address and panics.
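
To make the layout argument concrete, here is a minimal userland sketch. It is not Synology's code: `struct demo_dev_kernel`, `struct demo_dev_module`, and the option name `DEMO_CONFIG_EXTRA` are hypothetical stand-ins for one kernel struct built with and without a CONFIG-guarded field. It only shows that the two builds disagree about where a pointer field lives, so a reader using one layout does not find the pointer a writer stored using the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout as vmlinux was built: the made-up option is enabled. */
struct demo_dev_kernel {
    uint32_t id;
    uint64_t extra;     /* exists only when DEMO_CONFIG_EXTRA (hypothetical) is set */
    const char *name;   /* the field the kernel side will dereference */
};

/* The same struct as the out-of-tree module was built: option disabled. */
struct demo_dev_module {
    uint32_t id;
    const char *name;
};

/* Byte distance between where each side believes `name` lives. */
static inline ptrdiff_t demo_name_offset_gap(void)
{
    return (ptrdiff_t)offsetof(struct demo_dev_kernel, name) -
           (ptrdiff_t)offsetof(struct demo_dev_module, name);
}

/*
 * Fill a buffer using the module's layout, then read `name` back using the
 * kernel's layout. Returns 1 if the kernel-side read sees the pointer the
 * module stored, 0 if it sees something else.
 */
static inline int demo_kernel_sees_name(const char *name)
{
    unsigned char buf[sizeof(struct demo_dev_kernel)];
    struct demo_dev_module m = { .id = 1, .name = name };
    const char *seen;

    assert(name != NULL);
    memset(buf, 0, sizeof(buf));
    memcpy(buf, &m, sizeof(m));
    memcpy(&seen, buf + offsetof(struct demo_dev_kernel, name), sizeof(seen));
    return seen == name;
}
```

On x86-64, `demo_name_offset_gap()` is nonzero and `demo_kernel_sees_name("sda")` is expected to return 0. In this sketch the mismatched read lands on zeroed bytes; in a real kernel it lands on whatever happens to be stored at that offset.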

[   51.023252] general protection fault: 0000 [#1] SMP
[   51.028243] Modules linked in: mpt3sas(OE+) raid_class(OE) scsi_transport_sas(OE) e1000e(OE) button(OE) fb fbdev i2c_algo_bit usb_storage xhci_pci xhci_hcd usbcore usb_common
[   51.044018] CPU: 2 PID: 5154 Comm: modprobe Tainted: G           OE   4.4.302+ #64570
[   51.051836] Hardware name: Gigabyte Technology Co., Ltd. Z87N-WIFI/Z87N-WIFI, BIOS F6 08/12/2014
[   51.060606] task: ffff88040aa7a700 ti: ffff8804056d4000 task.ti: ffff8804056d4000
[   51.068078] RIP: 0010:[<ffffffff813efb8d>]  [<ffffffff813efb8d>] syno_libata_info_enum.constprop.0+0x6d/0x100
[   51.078000] RSP: 0018:ffff8804056d7390  EFLAGS: 00010246
[   51.083305] RAX: 0000000000000000 RBX: ffff880407508000 RCX: 0000000000000000
[   51.090429] RDX: 0000000000000076 RSI: 0000000000000000 RDI: ffffffffa0594290
[   51.097553] RBP: ffff8804056d75b0 R08: 000000000001d760 R09: ffffffff812d4fcc
[   51.104678] R10: ffff88040b003540 R11: 0000000000000000 R12: ffffffffa0594290
[   51.111802] R13: ffff88040750878d R14: 7065725f6873756c R15: ffff880407ba400c
[   51.118926] FS:  00007fc6b3551740(0000) GS:ffff88041d300000(0000) knlGS:0000000000000000
[   51.127003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.132741] CR2: 00007f1cc015cad8 CR3: 00000004076cd000 CR4: 00000000001606f0
[   51.139865] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   51.146989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.154113] Stack:
[   51.156123]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.163576]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.171029]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   51.178484] Call Trace:
[   51.180930]  [<ffffffff813f0252>] sd_probe+0x632/0x670
[   51.186069]  [<ffffffff8106b7e6>] ? blocking_notifier_call_chain+0x86/0x90
[   51.192934]  [<ffffffffa00006c2>] sd_probe_shim+0x52/0x80 [redpill]
[   51.199200]  [<ffffffff813b17cb>] driver_probe_device+0x19b/0x280
[   51.205291]  [<ffffffff813b19b0>] __device_attach_driver+0x80/0xa0
[   51.211461]  [<ffffffff813b1930>] ? __driver_attach+0x80/0x80
[   51.217200]  [<ffffffff813af504>] bus_for_each_drv+0x64/0xa0
[   51.222858]  [<ffffffff813b13a2>] __device_attach+0xa2/0x120
[   51.228509]  [<ffffffff813b1a0e>] device_initial_probe+0xe/0x10
[   51.234419]  [<ffffffff813b0acd>] bus_probe_device+0x8d/0xa0
[   51.240069]  [<ffffffff813ae4df>] device_add+0x3ff/0x620
[   51.245377]  [<ffffffff813e3058>] scsi_sysfs_add_sdev+0x88/0x280
[   51.251381]  [<ffffffff813e058b>] scsi_probe_and_add_lun+0xdcb/0x10b0
[   51.257812]  [<ffffffff813e0bd3>] __scsi_scan_target+0xa3/0x5a0
[   51.263722]  [<ffffffff813ae95c>] ? device_create+0x3c/0x40
[   51.269288]  [<ffffffff813bc2a7>] ? __pm_runtime_resume+0x47/0x60
[   51.275380]  [<ffffffff813e1185>] scsi_scan_target+0xb5/0xc0
[   51.281031]  [<ffffffffa055d1d7>] sas_rphy_add+0x107/0x150 [scsi_transport_sas]
[   51.288336]  [<ffffffffa057c095>] mpt3sas_transport_port_add+0x255/0x910 [mpt3sas]
[   51.295894]  [<ffffffff810943fa>] ? vprintk_default+0x1a/0x20
[   51.301632]  [<ffffffff810a2363>] ? del_timer_sync+0x43/0x50
[   51.307286]  [<ffffffffa0585370>] scsih_scan_finished.cold+0x214/0x252 [mpt3sas]
[   51.314675]  [<ffffffff813e142f>] do_scsi_scan_host+0x6f/0xa0
[   51.320411]  [<ffffffff813e15cd>] scsi_scan_host+0x16d/0x190
[   51.326065]  [<ffffffffa0579c9a>] _scsih_probe+0x42a/0x560 [mpt3sas]
[   51.332418]  [<ffffffff8131a260>] pci_device_probe+0x90/0xf0
[   51.338076]  [<ffffffff813b17cb>] driver_probe_device+0x19b/0x280
[   51.344160]  [<ffffffff813b1929>] __driver_attach+0x79/0x80
[   51.349722]  [<ffffffff813b18b0>] ? driver_probe_device+0x280/0x280
[   51.355980]  [<ffffffff813af469>] bus_for_each_dev+0x69/0xa0
[   51.361632]  [<ffffffff813b1109>] driver_attach+0x19/0x20
[   51.367021]  [<ffffffff813b0d56>] bus_add_driver+0x116/0x1d0
[   51.372673]  [<ffffffffa059b000>] ? 0xffffffffa059b000
[   51.377805]  [<ffffffff813b216a>] driver_register+0x8a/0xe0
[   51.383368]  [<ffffffff81318d21>] __pci_register_driver+0x41/0x50
[   51.389455]  [<ffffffffa059b0c1>] _mpt3sas_init+0xc1/0xd1 [mpt3sas]
[   51.395718]  [<ffffffff81000347>] do_one_initcall+0x87/0x130
[   51.401369]  [<ffffffff810bac8b>] do_init_module+0x5b/0x200
[   51.406934]  [<ffffffff810bcc0a>] load_module+0x1d9a/0x2280
[   51.412497]  [<ffffffff810b9660>] ? symbol_put_addr+0x40/0x40
[   51.418236]  [<ffffffff8118120c>] ? kernel_read+0x3c/0x50
[   51.423635]  [<ffffffff810bd2a3>] SYSC_finit_module+0x73/0x90
[   51.429371]  [<ffffffff810bd2d9>] SyS_finit_module+0x9/0x10
[   51.434936]  [<ffffffff8158464a>] entry_SYSCALL_64_fastpath+0x1e/0x93
[   51.441365] Code: 00 00 4c 89 e7 49 89 f5 e8 21 f9 02 00 84 c0 75 0b 41 c7 85 34 04 00 00 01 00 00 00 4d 8b b4 24 c0 52 00 00 4c 8d ab 8d 07 00 00 <49> 8b 86 88 00 00 00 48 85 c0 74 13 48 8b 30 48 c7 c7 5c 3a 74
[   51.461316] RIP  [<ffffffff813efb8d>] syno_libata_info_enum.constprop.0+0x6d/0x100
[   51.468891]  RSP <ffff8804056d7390>
[   51.472383] ---[ end trace 20daaee2ec75c3f1 ]---

You can disassemble vmlinux and find this code in sd_probe. The oops shows that when the general protection fault was raised, the CPU was executing the instruction that starts at the marked byte: <49> 8b 86.
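
As a rough aid for reading the oops above: by my decoding, the bytes `49 8b 86 88 00 00 00` are `mov 0x88(%r14),%rax`, and the preceding `4d 8b b4 24 c0 52 00 00` loads R14 from `0x52c0(%r12)`, where R12 (ffffffffa0594290) lies in the module address range. R14 holds 7065725f6873756c. The sketch below, with the hypothetical helper names `reg_to_ascii` and `is_canonical_48`, checks whether such a value looks like text rather than a pointer and whether it is a canonical x86-64 address, assuming 48-bit virtual addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Render the 8 bytes of a register value as text, lowest byte first
 * (x86 is little-endian). Non-printable bytes are shown as '.'.
 */
static inline void reg_to_ascii(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(value >> (8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}

/*
 * With 48-bit virtual addresses, bits 63..47 must all be equal.
 * Returns 1 for a canonical address, 0 otherwise.
 */
static inline int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* the 17 highest bits */
    return top == 0 || top == 0x1ffff;
}
```

Passing the R14 value to `reg_to_ascii` yields the text "lush_rep", and `is_canonical_48` reports it as non-canonical, while RBX (ffff880407508000) is canonical. Dereferencing a non-canonical address raises a general protection fault rather than a page fault, which fits the idea that the kernel read string data from a slot where it expected a pointer.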

PeterSuh-Q3 commented 1 year ago

https://github.com/PeterSuh-Q3/arpl-modules/blame/main/src/4.x/drivers/scsi/sd.c#L4666

That line of code uses CONFIG_SYNO_MULTIPATH_NATIVE_SAS_DEVICE_PREFIX. Do you think the definition is wrong?

jim3ma commented 1 year ago

https://github.com/PeterSuh-Q3/arpl-modules/blame/main/src/4.x/drivers/scsi/sd.c#L4666

That line of code uses CONFIG_SYNO_MULTIPATH_NATIVE_SAS_DEVICE_PREFIX. Do you think the definition is wrong?

I'm not sure.

scsi/sd.c is built into vmlinux, so you have to build your HBA modules to satisfy the sd_probe in scsi/sd.c. The source contains many MY_ABC_HERE and MY_DEF_HERE macros, which were originally CONFIG_SYNO_xxx options. You should not enable all of MY_ABC_HERE and MY_DEF_HERE when compiling mpt3sas.

PeterSuh-Q3 commented 1 year ago

And I've never done disassembly before, so I'll give it a try.

objcopy -O binary -R .note -R .comment -S vmlinux vmlinux.bin

objdump -D -b binary -m i386:x86-64 vmlinux.bin

jim3ma commented 1 year ago

And I've never done disassembly before, so I'll give it a try.

objcopy -O binary -R .note -R .comment -S vmlinux vmlinux.bin

objdump -D -b binary -m i386:x86-64 vmlinux.bin

target=/path/to/work
# bzImage-to-vmlinux.sh is in arpl
/path/to/arpl/files/board/arpl/overlayfs/opt/arpl/bzImage-to-vmlinux.sh \
  "$target/zImage" \
  "$target/vmlinux"

# https://github.com/marin-m/vmlinux-to-elf
/path/to/vmlinux-to-elf/vmlinux-to-elf "$target/vmlinux" "$target/vmlinux.elf"

After the conversion, you can use IDA Pro or another disassembler to analyze vmlinux.elf.

PeterSuh-Q3 commented 1 year ago

For the most recently compiled mpt3sas.ko, I recompiled by switching to MY_DEF_HERE instead of MY_ABC_HERE.

https://github.com/PeterSuh-Q3/arpl-modules/commit/b7b4acd11c7fabe26d390dfb52d1904f11bcc966

The reason for the change is that I think SYNO_DISK_SAS should be used in sd_probe.

I installed IDA Free 8.3 and opened the ELF file as shown below. Do you think I can analyze it? ^^

vmlinux.elf.zip

(Screenshot: 2023-09-07, 8:49:36 PM)

PeterSuh-Q3 commented 1 year ago

I compared it with the dmesg log of ds918+, which is operating normally.

(Screenshot: 2023-09-08, 12:16:41 AM)

jim3ma commented 1 year ago

I recompiled by switching to MY_DEF_HERE instead of MY_ABC_HERE.

I don't think it works.

You should check the sizes of the structs that sd_probe uses, such as:

    struct device
    struct scsi_device
    struct scsi_disk
    struct gendisk

  1. Print sizeof() for each struct as your module build sees it.
  2. Disassemble vmlinux and search for the code that allocates each struct, to find the size vmlinux uses.
  3. Confirm that the two sizes are the same.
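
The size check can be sketched in userland as below. `struct demo_scsi_device` is a hypothetical stand-in with made-up fields, and `format_size` and `size_matches` are illustrative helper names. In a real module you would take sizeof() of the actual kernel structs and print it with the kernel's own logging, and the value to compare against has to come from your own disassembly of vmlinux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel struct such as struct scsi_device. */
struct demo_scsi_device {
    void *host;
    unsigned int id;
    unsigned int lun;
    char vendor[8];
};

/*
 * Step 1: format "name: decimal (0xhex)". The hex form is handy because an
 * allocation size usually shows up as a hex immediate in a disassembly.
 * Returns what snprintf returns (the length that was or would be written).
 */
static inline int format_size(char *buf, size_t len, const char *name, size_t size)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "%s: %zu (0x%zx)", name, size, size);
}

/*
 * Step 3: returns 1 if the size the module was compiled with equals the
 * size read out of the vmlinux disassembly, 0 otherwise.
 */
static inline int size_matches(size_t compiled, size_t from_vmlinux)
{
    return compiled == from_vmlinux;
}
```

For example, `format_size(buf, sizeof buf, "demo_scsi_device", sizeof(struct demo_scsi_device))` fills `buf` with the line to log, and `size_matches` compares that size with the one found in step 2. Inside a kernel module, the same comparison could be enforced at build time with `BUILD_BUG_ON`, once the expected size is known.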