akobel / linux-thinkpad-x1-tablet

2 stars 0 forks source link

Camera support #1

Open akobel opened 2 years ago

akobel commented 2 years ago

Continued from https://github.com/linux-surface/linux-surface/issues/91, where similar work has been done for the camera of Surface Go 2; let's hope that migrating that effort to the X1 tablet is feasible...

(more precisely, coming over from this comment)

martin9959 commented 2 years ago

Let's try the PLL too; can you sudo i2ctransfer -f -y 2 w2@0x4d 0x09 0x13? Note that on Windows it's 0x11, but that seems to be a mistake, as it's making a setting that the datasheet calls out as "Reserved".

Changes nothing for me, with or without the previous changes.

Same here.

Failing that, can we dump the contents of the sensor's registers on Windows and Linux so we can compare them too?

Can do so on Linux, I will prepare that I think. On Windows, I2cTestTool fails to read from I2C4 where the camera is located, at least without further changes to the DSDT. Not surprising. I tried to blindly change I2C2 to I2C4 for the entry of the proxy device, but was greeted by a bluescreen once again; will try to get further...

As far as Linux is concerned, would that mean something like i2cdump -f 3 0x21? That would give me

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00    ...?............
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
30: 20 00 00 03 00 00 00 05 08 00 00 00 00 00 00 00     ..?...??.......
40: 00 02 00 00 00 00 00 00 07 00 00 00 00 00 00 00    .?......?.......
50: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00    ........?.......
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................

(i2cdump -f 3 0x10 gives exactly the same result).

martin9959 commented 2 years ago

Note that 0x09 seems to get set to 0x03 when qcam is started, no matter whether it was 0x00, 0x11 or 0x13 before. When qcam stops, it is set to 0x00.

djrscally commented 2 years ago

Let's try the PLL too; can you sudo i2ctransfer -f -y 2 w2@0x4d 0x09 0x13? Note that on Windows it's 0x11, but that seems to be a mistake, as it's making a setting that the datasheet calls out as "Reserved".

Changes nothing for me, with or without the previous changes.

Same here.

Failing that, can we dump the contents of the sensor's registers on Windows and Linux so we can compare them too?

Can do so on Linux, I will prepare that I think. On Windows, I2cTestTool fails to read from I2C4 where the camera is located, at least without further changes to the DSDT. Not surprising. I tried to blindly change I2C2 to I2C4 for the entry of the proxy device, but was greeted by a bluescreen once again; will try to get further...

As far as Linux is concerned, would that mean something like i2cdump -f 3 0x21? That would give me

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00    ...?............
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
30: 20 00 00 03 00 00 00 05 08 00 00 00 00 00 00 00     ..?...??.......
40: 00 02 00 00 00 00 00 00 07 00 00 00 00 00 00 00    .?......?.......
50: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00    ........?.......
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................

(i2cdump -f 3 0x10 gives exactly the same result).

This format doesn't work for the sensor, as the register addresses are two bytes rather than one; is there a flag for i2cdump to extend the format?

martin9959 commented 2 years ago

As far as Linux is concerned, would that mean something like i2cdump -f 3 0x21?

This format doesn't work for the sensor, as the register addresses are two bytes rather than one; is there a flag for i2cdump to extend the format?

That could be i2cdump -f 3 0x21 w, is that better?:

     0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
00: 0000 0000 0000 0101 0000 0000 0000 0000 
08: 0000 0000 0000 0000 0000 0000 0000 0000 
10: 0000 0000 0000 0000 0000 0000 0000 0000 
18: 0000 0000 0000 0000 0000 0000 0000 0000 
20: 0000 0000 0000 0000 0000 0000 0000 0000 
28: 0000 0000 0000 0000 0000 0000 0000 0000 
30: 0000 0000 0000 0028 0000 0000 0000 4600 
38: 0000 0000 00f8 0000 0000 0000 0000 0000 
40: 0000 0000 0000 0000 0000 0000 0000 0000 
48: 0000 0000 0000 0000 0000 0000 0000 0000 
50: 0000 0000 0000 0000 0000 0000 1211 0404 
58: 8080 0480 0000 0000 0000 0000 0000 0000 
60: 0000 0000 0000 0000 0000 0000 0000 0000 
68: 0000 0000 0000 0000 0000 0000 0000 0000 
70: 0000 4b3a 0000 000e 0000 0000 0000 0000 
78: 0000 0000 0000 0000 0000 0000 0000 0000 
80: 0000 0000 0000 0000 0000 0000 0000 0000 
88: 0000 0000 0000 0000 0000 0000 0000 0000 
90: 0000 0000 0000 0000 0000 0000 0000 0000 
98: 0000 0000 0000 0000 0000 0000 0000 0000 
a0: 0000 0000 0000 0000 0000 0000 0000 0000 
a8: 0000 0000 0000 0000 0000 0000 0000 0000 
b0: 0000 0000 0000 0000 0000 0000 0000 0000 
b8: 0000 0000 0000 0000 0000 0000 0000 0000 
c0: 0000 0000 0000 0000 0000 0000 0000 0000 
c8: 0000 0000 0000 0000 0000 0000 0000 0000 
d0: 0000 0000 0000 0000 0000 0000 0000 0000 
d8: 0000 0000 0000 0000 0000 0000 0000 0000 
e0: 0000 0000 0000 0000 0000 0000 0000 0000 
e8: 0000 0000 0000 0000 0000 0000 0000 0000 
f0: 0000 0000 0000 0000 0000 0000 0000 0000 
f8: 0000 0000 0000 0000 0000 0000 0000 0000 
akobel commented 2 years ago

The registers where we (might) know the meaning are between 0x0100 and about 0x6000. i2cdump seems only to be able to access 256 bytes at once, plus the word-style reading seems fishy (at least, I get different output for the PMIC between word- and byte-wise reads). With i2ctransfer, you can reach the higher start addresses and read more at once (8k being the limit according to the manpage).

So I did four runs of i2ctransfer -y -f 8 w2@0x10 $start 0x00 r8192 with addresses $start in 0x00, 0x20, 0x40, 0x60 while qcam was running. This should be a superset of all "interesting" parts; see ov2740.hex.gz (concatenated and converted to raw data).

Windows is still on my to-do list; didn't do another attempt at the DSDT yet, life intervened.

martin9959 commented 2 years ago

The registers where we (might) know the meaning are between 0x0100 and about 0x6000. i2cdump seems only to be able to access 256 bytes at once, plus the word-style reading seems fishy (at least, I get different output for the PMIC between word- and byte-wise reads). With i2ctransfer, you can reach the higher start addresses and read more at once (8k being the limit according to the manpage).

So I did four runs of i2ctransfer -y -f 8 w2@0x10 $start 0x00 r8192 with addresses $start in 0x00, 0x20, 0x40, 0x60 while qcam was running. This should be a superset of all "interesting" parts; see ov2740.hex.gz (concatenated and converted to raw data).

Hm, ok - this is the result of:

i2ctransfer -y -f 3 w2@0x10 0x00 0x00 r8192 > i2ctransfer.txt
i2ctransfer -y -f 3 w2@0x10 0x20 0x00 r8192 >> i2ctransfer.txt
i2ctransfer -y -f 3 w2@0x10 0x40 0x00 r8192 >> i2ctransfer.txt
i2ctransfer -y -f 3 w2@0x10 0x60 0x00 r8192 >> i2ctransfer.txt

I don't quite know how to convert it to raw data, but maybe i helps in that form as well.

akobel commented 2 years ago

Here in hex format; only minor differences between the Tablet Gen1 and Gen2 dumps: something which looks like a serial number or something similar human-readable, and some few bytes in unknown, undescribed registers, AFAICS.

akobel commented 2 years ago

News from the Windows front:

First, note to future me: iasl seems to be picky about whitespace (or perhaps line endings?), but shy on throwing errors or warnings, much less meaningful ones. Also, disable optimizations with iasl -oa; it's also called "compatibility mode", but apparently everything else is "incompatibility mode" for our subject.

Second, couldn't make it. The I2C buses in the DSDT aren't what Windows thinks where the devices are. OV2740 should be on \_SB.PCI0.I2C4 according to the DSDT, Windows says it's "BIOS device" \_SB.PCI0.LNK2. OV8858 is claimed to be on LNK0. So, hooking the proxy to I2C4, it didn't even show up. Moving it to LNK2 as follows, the Resource Hub Proxy device did appear (as \_SB.PCI0.LNK2.RHPX), but I2cTestTool sees nothing.

Scope (\_SB.PCI0.LNK2)
{
    Device(RHPX)
    {
        Name(_HID, "MSFT8000")
        Name(_CID, "MSFT8000")
        Name(_UID, 1)

        Name(_CRS, ResourceTemplate()
        {
            I2CSerialBus(0xFFFF, , 0, , "\\_SB.PCI0.LNK2", , ,) // in this case, we want to grant access to the {-I2C2-}{+LNK2+} bus
        })

        Name(_DSD, Package()
        {
            ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
            Package()
            {
                Package(2) { "bus-I2C-LNK2", Package() { 0 }}, // The 0 here is the index of the related resource in _CRS
            }
        })
    }
}

Does someone have a clue whether all three of Scope, I2CSerialBus and/or Package line need to be adjusted, and if so, to what? (In the original template, I2C2 was where LNK2 appears now.)

[edit] Okay, LNK is an "interrupt link", apparently. Doesn't sound like an I2C bus. But from Linux, we know that the cam is accessible via I2C, and (on my Tablet 2nd Gen) sits on \_SB_.PCI0.I2C4.CAM1, and the PMIC on \_SB_.PCI0.I2C2.PMIC. And the first modified DSDT (with I2C2 instead of LNK2 in the above snippet) worked perfectly for talking to the TPS68470. So my first attempt with Scope, I2CSerialBus, Package all saying I2C4 instead of LNK2 looks really plausible to me? Puzzled. [/edit]

martin9959 commented 2 years ago

[edit] Okay, LNK is an "interrupt link", apparently. Doesn't sound like an I2C bus. But from Linux, we know that the cam is accessible via I2C, and (on my Tablet 2nd Gen) sits on \_SB_.PCI0.I2C4.CAM1, and the PMIC on \_SB_.PCI0.I2C2.PMIC. And the first modified DSDT (with I2C2 instead of LNK2 in the above snippet) worked perfectly for talking to the TPS68470. So my first attempt with Scope, I2CSerialBus, Package all saying I2C4 instead of LNK2 looks really plausible to me? Puzzled. [/edit]

Are you sure I2C4 is the right one? In Windows, device manager -> system devices, I have 3 "Intel Serial IO I2C Host Controllers", 9D60, 9D62 and 9D63. In the "details" tab, 9D63 has the "dependency dependent" (sic) ACPI\INT3474\2 and the BIOS device name of \_SB.PCI0\I2C3. The camera sensor ov2740 is also in the list, with BIOS device name \_SB_PCI0.LNK2 and device instance path ACPI\INT3474\2.

So for me, this looks like the right one for my device could be I2C3, which would correspond to the Linux numbering (coincidence or not). I couldn't find the corresponding I2Cx for ov8858 though.

IIUC, the DSDT for your device says it's I2C4 (no such information in my DSDT), but if I'm not mistaken, it wouldn't be the first time the DSDT is wrong.

djrscally commented 2 years ago

Sorry not been around much lately folks - picked up a bug from my kids so been semi-floored by that.

OV2740 should be on _SB.PCI0.I2C4 according to the DSDT, Windows says it's "BIOS device" _SB.PCI0.LNK2. OV8858 is claimed to be on LNK0. So, hooking the proxy to I2C4, it didn't even show up. Moving it to LNK2 as follows, the Resource Hub Proxy device did appear (as _SB.PCI0.LNK2.RHPX), but I2cTestTool sees nothing.

_SB.PCI0.LNK2 is the ACPI path of the sensor device, not the I2C Bus it sits on. You have to rely on the dump you pulled, which calls out the OV2470 as being on I2C3 (both in the _CRS method which gives it as a string, and also in _DSM as a byte array)

akobel commented 2 years ago

Ah-ha. Yes, that worked. Here's two dumps on Windows, while the camera was running: ov2740-win.hex.gz ov2740-win2.hex.gz (Produced with I2cTestTool 0x10 I2C3 - with patched DSDT, of course - and write {00} + read 32768.)

Interestingly, there seems to be an offset of 9 (!) bytes to the dump created on Linux with i2ctransfer. I guess that the Linux dump is correct, as the chip ID 00 27 40 can be found there at 0x300a to 0x300c; in the Windows dump, it's at 0x3001 to 0x3003. Not sure why that happened, but I observed the same over several runs. Hence, another version attached which is shifted by 9 bytes: ov2740-win-shift9.hex.gz ov2740-win2-shift9.hex.gz

djrscally commented 2 years ago

@akobel nice one, thank you! No idea why it might be shifted by 9 bytes...that's super weird. I'm going to (without any investigation) chalk it up to the I2cTestTool being a bit junk and forget it for now :P Writing a port of i2ctransfer for Windows is on my to-do list, as that tool is sort of meh.

There are quite a few differences between linux and windows, but they all seem to fall into these categories:

  1. Read only (I.E. something in the operation of the camera is changing it)
  2. Undocumented (the datasheet doesn't say what it is)
  3. Gain/Exposure/Lens Correction/Defective Pixel Correction settings - these won't affect the operation of the camera

The only thing that's interesting there is the OTP registers. I don't know that those are likely to help us figure out why it's not streaming data for more than half a frame but a bit of investigation here might let us figure out how the default settings are made, which would be nice.

EDIT: Brilliantly the datasheets define a format to store register-value pairs in the OTP data for providing defaults...which it seems like Microsoft hasn't followed for the Go2's camera, so this is not as productive as hoped.

markum commented 2 years ago

Thank you for your work. Is there anything "for dummies who need more precise instructions" which I can do and would be helpful? I try to follow what you write and at the moment have the feeling contributing is not easy although I desperately would like to see this working.

akobel commented 2 years ago

Grrr. Tried to follow up by installing the probe driver as linked in https://github.com/akobel/linux-thinkpad-x1-tablet/issues/1#issuecomment-1073266532.

Setting up the device worked, it appears as "I2C/SPI Probe Controller Driver" among the System devices, and I can fire up traceview. If I disable and/or reenable this very device, I also see activity in the trace.

However, I couldn't successfully hook it up into neither PMIC (INT3472 on I2C2) nor OV2740 (INT3474 on I2C3). So I don't get any meaningful output. :(

My DSDT change: added

Scope (\_SB.PCI0.I2C2)
{
    Device(SPB1)
    {
        Name(_HID, "PROBE01")
        Name(_UID, 1)
            Method(_CRS, 0x0, NotSerialized)
        {
            Name (RBUF, ResourceTemplate ()
            {
                I2CSerialBus (0x4C, ControllerInitiated, 100000,AddressingMode7Bit, "\\_SB.PCI0.I2C2",,,,)
             // I2CSerialBus (0x07, ControllerInitiated, 100000,AddressingMode7Bit, "\\_SB.PCI0.I2C2",,,,) // didn't work either
            })
            Return(RBUF)
        }
    }
}

and changed "\\_SB.PCI0.I2C2" to "\\_SB.PCI0.SPB1" in

Scope (\_SB.PCI0.I2C2)
{
    Device (PMIC)
    {
        /* ... */
        Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
        {
            Name (SBUF, ResourceTemplate ()
            {
                I2cSerialBusV2 (0x004C, ControllerInitiated, 0x00061A80,
                    AddressingMode7Bit, "\\_SB.PCI0.SPB1", // from "\\_SB.PCI0.I2C2"
                    0x00, ResourceConsumer, , Exclusive,
                    )
            })
            Return (SBUF) /* \_SB_.PCI0.I2C2.PMIC._CRS.SBUF */
        }
        /* ... */
    }
}

Would this look about right?

akobel commented 2 years ago

Also, the guide for the probe driver asks to disable the target device before starting the trace, to get a log of the initialization. But I can't find a device node for the PMIC. Perhaps completely disabling "I2C Host Controller - 9D62" would do? But so far, also with this I only get output for the probe device itself.

Finally, even if I add a DEP entry in the PMIC device, no dependents are listed for the probe device (this "Dependency dependents" field doesn't even appear). So it somehow looks as if the DSDT modification for PMIC is entirely ignored. And probably the same for the sensor aka CAM1 - after all, the DSDT says something about bus I2C4, but it appears on I2C3. So IIUC, Windows grabs the assignment from somewhere else than the DSDT? And I would need to modify the "path" from "I2C2/I2C3" to "I2C2/I2C3 via SPB1" at this "somewhere else" location?

djrscally commented 2 years ago

@akobel

OK, should have forseen this sorry. The problem is the annoying way DSDT is working for you. We've only seen this on the Go2 so far; rather than devices being hardcoded in DSDT they're instead being configured at runtime by values loaded into memory by the BIOS, which makes the DSDT.dsl a confusing minefield to navigate. You're modifying Device (PMIC) which is a 100% sensible thing to do, because in a sane world that would be the bloody PMIC...but it isn't. If you check its _STA method you'll see:

            Method (_STA, 0, NotSerialized)  // _STA: Status
            {
                If ((SCSS == 0x01))
                {
                    Return (0x0F)
                }
                Else
                {
                    Return (0x00)
                }
            }

SCSS is a memory location into which the BIOS will have loaded some value...which isn't 0x01 and so that PMIC device is just disabled. If you check the dump of ACPI stuff we're interested in for the INT3472 entry, the ACPI path given is _SB_.PCI0.CLP0, so it's Device (CLP0) you need to edit.

Now the _CRS for that currently is this:

            Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
            {
                Local0 = Buffer (0x02)
                    {
                         0x79, 0x00                                       // y.
                    }
                ConcatenateResTemplate (Local0, IICB (C0IA, C0IB), Local2)
                Local0 = Local2
                Return (Local0)
            }

ConcatenateResTemplate() is an ACPI method; IICB() is a method defined in the DSDT which builds an I2CSerialBusV2 depending on the inputs, and C0IA and C0IB are locations in memory into which the BIOS has loaded some values. Now we can't edit those, which means we can't fixup the _CRS whilst it's using this method, so instead we have to comment out that entire _CRS and hardcode our own that works. Isn't this fun?

Fortunately it's not too hard because most of an I2CSerialBusV2 is boilerplate. Basically, replace it with this:

Method (_CRS, 0, NotSerialized)
{
    Name (SBUF, ResourceTemplate()
    {
        I2CSerialBusV2 (0x004D, ControllerInitiated, 0x00061A80,
            AddressingMode7Bit, "\\_SB.PCI0.SPB1",
            0x00, ResourceConsumer, , Exclusive,
        )
    })
    Return (SBUF)
}

For your SPB1 device, you've written the address in the I2cSerialBus as 0x4c - it should be 0x4d according to the DSDT dump and the i2ctransfer calls you were making earlier. Also, I'm not sure what the difference is but I made it an I2cSerialBusV2 instead of an I2CSerialBus...in case it doesn't work you could try switching to that - it should be identical to the above one except with the path set to "\_SB.PCI0.I2C"

Also, the guide for the probe driver asks to disable the target device before starting the trace, to get a log of the initialization. But I can't find a device node for the PMIC. Perhaps completely disabling "I2C Host Controller - 9D62" would do? But so far, also with this I only get output for the probe device itself.

I honestly can't remember what I did here...sorry, disabling the I2C2 controller would probably work and I'd happily try it, but I can't remember what exactly it was that I tried. I also can't find the PMIC as a standalone in device manager, but I can see the cameras - possibly I only ever tried those and not the PMIC. You could also try disabling the overarching Camera devices (I have "Intel(R) AVStream Camera 2500" for example) and see if that does it.

Finally, even if I add a DEP entry in the PMIC device, no dependents are listed for the probe device (this "Dependency dependents" field doesn't even appear). So it somehow looks as if the DSDT modification for PMIC is entirely ignored. And probably the same for the sensor aka CAM1 - after all, the DSDT says something about bus I2C4, but it appears on I2C3. So IIUC, Windows grabs the assignment from somewhere else than the DSDT? And I would need to modify the "path" from "I2C2/I2C3" to "I2C2/I2C3 via SPB1" at this "somewhere else" location?

This is just because the device isn't as you (entirely reasonably) expect. The PMIC is device CLP0, the cameras are LNK0 and LNK2

akobel commented 2 years ago

Well, what can you say, apart from: "thanks, that worked"? Folks like you are the proof that security by obscurity is not a reliable engineering practice... ;-)

Did a dump of the PMIC (which seems to be in the device list as "Intel(R) Control Logic", by the way; now that I know to look for CLP0) with the following actions:

seq #   time       action
================================================================================
    1   00:12:13   enable Intel(R) Serio IO I2C Host Controller - 9D62 (I2C2)
 6043   00:12:41   start Camera (front cam)
 7126   00:16:06   stop Camera
 7962   00:17:05   disable Intel(R) Control Logic
10857   00:18:01   enable Intel(R) Control Logic
16891   00:19:01   start Camera (front cam)
17974   00:20:01   switch to rear cam
19703   00:21:03   stop Camera

Won't try to parse this tonight. Just so much: at first glance, grepping for "write" indeed shows a trace of the PMIC register programming on startup, as I hoped. So this should give the order of operations for the power-up sequence (starting from line 6043).

I would expect that I can make a dump of the commands to the sensor with a similar approach, but that it'll be way more data with less clear meaning.

For the sake of it, here's the raw dump in traceview's internal format: LogSession_20220408_001020.etl.gz. You'll need the spbProbe.pdb from SimplePeripheralBusProbe 1.1 for interpretation, as mentioned in the guide.

djrscally commented 2 years ago

Well, what can you say, apart from: "thanks, that worked"? Folks like you are the proof that security by obscurity is not a reliable engineering practice... ;-)

Heh thanks; they certainly tried pretty hard.

For the sake of it, here's the raw dump in traceview's internal format: LogSession_20220408_001020.etl.gz. You'll need the spbProbe.pdb from SimplePeripheralBusProbe 1.1 for interpretation, as mentioned in the guide.

Nice thanks. I parsed it and stripped out the BS, to leave us with nothing but the IO: https://pastebin.com/AyDzXSWa

I'll start going through it and see if I can decipher what the trick is.

djrscally commented 2 years ago

OK so register 0x27 (which controls the GPIO out) gets values 0x00, 0x10, 0x30, 0x70, 0x50, 0x10, 0x00 in that order, at various points throughout the trace, which looks like this:

bit 7 6 5 4 3 2 1 0
0x00 0 0 0 0 0 0 0 0
0x10 0 0 0 1 0 0 0 0
0x30 0 0 1 1 0 0 0 0
0x70 0 1 1 1 0 0 0 0
0x50 0 1 0 1 0 0 0 0
0x10 0 0 0 1 0 0 0 0
0x00 0 0 0 0 0 0 0 0

It's the same lines 4, 5 and 6 being toggled that we settled on earlier, so that confirms those. They're toggled in that order during powerup...but line 5 is then shut off first. Odd ordering. It does look like they're all active the whole time streaming is running though; a bit difficult to be certain with that output I always find...but I'm pretty sure the change from 0x70 to 0x50 doesn't go ahead until streaming stops. Still...we could re-order the calls to gpiod_set_value() in the ov2740 driver's powerup() and powerdown() to match that format.

Might be worth checking we're doing the exact same thing on linux. Can you edit drivers/gpio/gpio-tps68470.c and add the following debug call to tps68470_gpio_set():

static void tps68470_gpio_set(struct gpio_chip *gc, unsigned int offset,
                int value)
{
    struct tps68470_gpio_data *tps68470_gpio = gpiochip_get_data(gc);
    struct regmap *regmap = tps68470_gpio->tps68470_regmap;
    unsigned int reg = TPS68470_REG_GPDO;

    if (offset >= TPS68470_N_REGULAR_GPIO) {
        reg = TPS68470_REG_SGPO;
        offset -= TPS68470_N_REGULAR_GPIO;
    }

    pr_info("%s(): reg=0x%02x, offset=0x%02x, val=0x%02x\n", __func__, reg, offset, value);

    regmap_update_bits(regmap, reg, BIT(offset), value ? BIT(offset) : 0);
}

And then check what happens in Linux? I'd leave it "streaming" for a good 10 seconds or so before turning off; it makes spotting the transition much easier when the timestamps are radically different. The output will be written to dmesg, you can grep for tps68470

EDIT:

Also;

which seems to be in the device list as "Intel(R) Control Logic", by the way; now that I know to look for CLP0

Do you know a way to search then? Or did you just manually check every entry?

martin9959 commented 2 years ago

Might be worth checking we're doing the exact same thing on linux. Can you edit drivers/gpio/gpio-tps68470.c and add the following debug call to tps68470_gpio_set():

static void tps68470_gpio_set(struct gpio_chip *gc, unsigned int offset,
              int value)
{
  struct tps68470_gpio_data *tps68470_gpio = gpiochip_get_data(gc);
  struct regmap *regmap = tps68470_gpio->tps68470_regmap;
  unsigned int reg = TPS68470_REG_GPDO;

  if (offset >= TPS68470_N_REGULAR_GPIO) {
      reg = TPS68470_REG_SGPO;
      offset -= TPS68470_N_REGULAR_GPIO;
  }

  pr_info("%s(): reg=0x%02x, offset=0x%02x, val=0x%02x\n", __func__, reg, offset, value);

  regmap_update_bits(regmap, reg, BIT(offset), value ? BIT(offset) : 0);
}

Here:

[   77.664499] tps68470_gpio_set(): reg=0x27, offset=0x05, val=0x00
[   77.665202] tps68470_gpio_set(): reg=0x27, offset=0x06, val=0x00
[   77.665428] tps68470_gpio_set(): reg=0x27, offset=0x04, val=0x00
[   77.669115] tps68470_gpio_set(): reg=0x27, offset=0x05, val=0x01
[   77.669394] tps68470_gpio_set(): reg=0x27, offset=0x06, val=0x01
[   77.669669] tps68470_gpio_set(): reg=0x27, offset=0x04, val=0x01
[   77.710746] ipu3-cio2 0000:00:14.3: payload length is 2725632, received 1540032
[   77.737140] use of bytesused == 0 is deprecated and will be removed in the future,
[   77.737145] use the actual size instead.
[   77.834246] ipu3-cio2 0000:00:14.3: CSI-2 receiver port 2: frame sync error
[   77.867019] ipu3-cio2 0000:00:14.3: payload length is 2725632, received 0

(...)

[   82.326603] ipu3-cio2 0000:00:14.3: CSI-2 receiver port 2: frame sync error
[   82.359147] ipu3-cio2 0000:00:14.3: payload length is 2725632, received 0
[   82.391891] ipu3-cio2 0000:00:14.3: payload length is 2725632, received 0
[   82.404718] tps68470_gpio_set(): reg=0x27, offset=0x05, val=0x00
[   82.405345] tps68470_gpio_set(): reg=0x27, offset=0x06, val=0x00
[   82.405620] tps68470_gpio_set(): reg=0x27, offset=0x04, val=0x00

Full dmesg

akobel commented 2 years ago

Same here. Changing the order to match Windows didn't improve anything for the camera output, too, with or without the previous tweaks mentioned in https://github.com/akobel/linux-thinkpad-x1-tablet/issues/1#issuecomment-1079972023 (via i2ctransfer).

So my uneducated bet would be that the power-up sequence is okay, but either some clock speed negotiation fails, or the driver misses some initialization or setting for the sensor. Plan: I'll try to make the I2C probe driver working for the sensor, too, and snoop a trace.

akobel commented 2 years ago

Here we go.

Apparently, upon device detection nothing is written at all (sounds reasonable); 17:14:02 is when I started the Camera app. I can't really tell accurately when it started streaming; but I assume that the writes to 0x3208 (seq. # 6167) mark the end of the init phase. Accordingly, # 28796 would be the start of the shutdown phase.

akobel commented 2 years ago

ov2740-setup.c.txt contains the same in a format friendlier to a C compiler. I plan to give it a shot and put this into mode_1932x1092_regs in ov2740.c and see what happens; not now, though. One notable difference from a brief manual comparison: around 0x3800, there's different settings between Windows and the current entries in mode_1932x1092_regs. These relate to timing control or image windowing control according to the datasheet, AFAIU; so it's not totally unreasonable that those could affect payload lengths?

akobel commented 2 years ago

Also;

which seems to be in the device list as "Intel(R) Control Logic", by the way; now that I know to look for CLP0

Do you know a way to search then? Or did you just manually check every entry?

@djrscally Sorry, just manually checking and a bit of luck to hit the right one before getting too bored. But I bet there's some way to get it via PowerShell (driverquery seems to be a powerful command?), and now that I think of it, there's also a list not just of dependencies, but also dependents, IIRC. That could be a means to trace it manually quicker than just looking at each entry.

akobel commented 2 years ago

not now, though.

Oh well, whatever. 20220409_19h33m43s_grim

Not quite there yet, but hey. Details will follow - essentially, copy-paste everything up to # 6167 into mode_1932x1092_regs and see what happens.

martin9959 commented 2 years ago

Not quite there yet, but hey. Details will follow - essentially, copy-paste everything up to # 6167 into mode_1932x1092_regs and see what happens.

Wonderful! :-)

I tried to only change values in ov2740.c without adding anything, and got to the same result with this short diff:

--- kernel/drivers/media/i2c/ov2740-orig.c  2022-04-10 12:04:04.992196102 +0200
+++ kernel/drivers/media/i2c/ov2740.c   2022-04-10 11:50:20.141698417 +0200
@@ -153 +153 @@
-   {0x3083, 0xB4},
+   {0x3083, 0xb4},
@@ -213 +213 @@
-   {0x3801, 0x02},
+   {0x3801, 0x00},
@@ -215 +215 @@
-   {0x3803, 0x0a},
+   {0x3803, 0x00},
@@ -217 +217 @@
-   {0x3805, 0x8e},
+   {0x3805, 0x8f},
@@ -219 +219 @@
-   {0x3807, 0x4e},
+   {0x3807, 0x47},
@@ -224,2 +224,2 @@
-   {0x380c, 0x04},
-   {0x380d, 0x38},
+   {0x380c, 0x08},
+   {0x380d, 0x70},
@@ -227 +227 @@
-   {0x380f, 0x60},
+   {0x380f, 0x56},
djrscally commented 2 years ago

Ah nice job guys! That's great. We've got the ov2740 datasheet so just need to lookup what those changed values might be and figure out what's wrong...

Regarding the colouring, you might try horizontally flipping the sensor output. I don't remember the exact command but it'll be something like v4l2-ctl -d /dev/v4l-subdevN -l to list the controls, and v4l2-ctl -d /dev/v4l-subdevN -c control=value to set them.

On Sun, 10 Apr 2022, 11:14 martin9959, @.***> wrote:

Not quite there yet, but hey. Details will follow - essentially, copy-paste everything up to # 6167 into mode_1932x1092_regs and see what happens.

Wonderful! :-)

I tried to only change values in ov2740.c without adding anything, and got to the same result with this short diff:

--- kernel/drivers/media/i2c/ov2740-orig.c 2022-04-10 12:04:04.992196102 +0200 +++ kernel/drivers/media/i2c/ov2740.c 2022-04-10 11:50:20.141698417 +0200 @@ -153 +153 @@

  • {0x3083, 0xB4},
  • {0x3083, 0xb4}, @@ -213 +213 @@
  • {0x3801, 0x02},
  • {0x3801, 0x00}, @@ -215 +215 @@
  • {0x3803, 0x0a},
  • {0x3803, 0x00}, @@ -217 +217 @@
  • {0x3805, 0x8e},
  • {0x3805, 0x8f}, @@ -219 +219 @@
  • {0x3807, 0x4e},
  • {0x3807, 0x47}, @@ -224,2 +224,2 @@
  • {0x380c, 0x04},
  • {0x380d, 0x38},
  • {0x380c, 0x08},
  • {0x380d, 0x70}, @@ -227 +227 @@
  • {0x380f, 0x60},
  • {0x380f, 0x56},

— Reply to this email directly, view it on GitHub https://github.com/akobel/linux-thinkpad-x1-tablet/issues/1#issuecomment-1094237581, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDBE22N2TRLMG3N5TAG2XTVEKSYJANCNFSM5PPEUDAQ . You are receiving this because you were mentioned.Message ID: @.***>

martin9959 commented 2 years ago

Ah nice job guys! That's great. We've got the ov2740 datasheet so just need to lookup what those changed values might be and figure out what's wrong... Regarding the colouring, you might try horizontally flipping the sensor output. I don't remember the exact command but it'll be something like v4l2-ctl -d /dev/v4l-subdevN -l to list the controls, and v4l2-ctl -d /dev/v4l-subdevN -c control=value to set them.

I don't seem to have the horizontal_flip / vertical_flip controls:

root@x1t:~# v4l2-ctl -l -d /dev/v4l-subdev4 

User Controls

                       exposure 0x00980911 (int)    : min=4 max=2178 step=1 default=2178 value=1243

Image Source Controls

              vertical_blanking 0x009e0901 (int)    : min=28 max=31675 step=1 default=1094 value=1094
            horizontal_blanking 0x009e0902 (int)    : min=228 max=228 step=1 default=228 value=228 flags=read-only
                  analogue_gain 0x009e0903 (int)    : min=128 max=1983 step=1 default=128 value=128

Image Processing Controls

                 link_frequency 0x009f0901 (intmenu): min=0 max=0 default=0 value=0 (360000000 0x15752a00) flags=read-only
                     pixel_rate 0x009f0902 (int64)  : min=0 max=144000000 step=1 default=144000000 value=144000000 flags=read-only
                   test_pattern 0x009f0903 (menu)   : min=0 max=4 default=0 value=0 (Disabled)
                   digital_gain 0x009f0905 (int)    : min=1024 max=4095 step=1 default=1024 value=1024
djrscally commented 2 years ago

Oh! Well that's interesting. I can add it for you, I'm travelling at the moment though so might take a couple of days. If someone wants to try it though you can copy from the ov5693 driver - it's pretty straightforward

On Sun, 10 Apr 2022, 17:20 martin9959, @.***> wrote:

Ah nice job guys! That's great. We've got the ov2740 datasheet so just need to lookup what those changed values might be and figure out what's wrong... Regarding the colouring, you might try horizontally flipping the sensor output. I don't remember the exact command but it'll be something like v4l2-ctl -d /dev/v4l-subdevN -l to list the controls, and v4l2-ctl -d /dev/v4l-subdevN -c control=value to set them.

I don't seem to have the horizontal_flip / vertical_flip controls:

@.***:~# v4l2-ctl -l -d /dev/v4l-subdev4

User Controls

                   exposure 0x00980911 (int)    : min=4 max=2178 step=1 default=2178 value=1243

Image Source Controls

          vertical_blanking 0x009e0901 (int)    : min=28 max=31675 step=1 default=1094 value=1094
        horizontal_blanking 0x009e0902 (int)    : min=228 max=228 step=1 default=228 value=228 flags=read-only
              analogue_gain 0x009e0903 (int)    : min=128 max=1983 step=1 default=128 value=128

Image Processing Controls

             link_frequency 0x009f0901 (intmenu): min=0 max=0 default=0 value=0 (360000000 0x15752a00) flags=read-only
                 pixel_rate 0x009f0902 (int64)  : min=0 max=144000000 step=1 default=144000000 value=144000000 flags=read-only
               test_pattern 0x009f0903 (menu)   : min=0 max=4 default=0 value=0 (Disabled)
               digital_gain 0x009f0905 (int)    : min=1024 max=4095 step=1 default=1024 value=1024

— Reply to this email directly, view it on GitHub https://github.com/akobel/linux-thinkpad-x1-tablet/issues/1#issuecomment-1094306659, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDBE25ANLU52MVRS6TMBRTVEL5WHANCNFSM5PPEUDAQ . You are receiving this because you were mentioned.Message ID: @.***>

djrscally commented 2 years ago

Not quite there yet, but hey. Details will follow - essentially, copy-paste everything up to # 6167 into mode_1932x1092_regs and see what happens.

Wonderful! :-)

I tried to only change values in ov2740.c without adding anything, and got to the same result with this short diff:

--- kernel/drivers/media/i2c/ov2740-orig.c    2022-04-10 12:04:04.992196102 +0200
+++ kernel/drivers/media/i2c/ov2740.c 2022-04-10 11:50:20.141698417 +0200
@@ -153 +153 @@
- {0x3083, 0xB4},
+ {0x3083, 0xb4},
@@ -213 +213 @@
- {0x3801, 0x02},
+ {0x3801, 0x00},
@@ -215 +215 @@
- {0x3803, 0x0a},
+ {0x3803, 0x00},
@@ -217 +217 @@
- {0x3805, 0x8e},
+ {0x3805, 0x8f},
@@ -219 +219 @@
- {0x3807, 0x4e},
+ {0x3807, 0x47},
@@ -224,2 +224,2 @@
- {0x380c, 0x04},
- {0x380d, 0x38},
+ {0x380c, 0x08},
+ {0x380d, 0x70},
@@ -227 +227 @@
- {0x380f, 0x60},
+ {0x380f, 0x56},

So the top one is undocumented, but the other changes are nothing but the crop rectangle (which starts at 0,0 in windows but is centered in the pixel array in the Linux driver), plus the timing HTS and VTS registers. Now the crop will affect the payload size that's sent to the CIO2 device, but errors here have not in my experience resulted in a black screen and 0 payload size. You usually get a distorted image and non zero payloads. It's probably worth sanity checking the crop/output/timing registers in the linux driver to make sure they all are valid settings (so output <= crop <= timing) - 0x3800 to 0x380f, plus the binning controls

martin9959 commented 2 years ago

OK, I experimented a bit more and it seems to me that this one change makes the difference:

@@ -217 +217 @@
-   {0x3805, 0x8e},
+   {0x3805, 0x8f},

All the rest can stay unchanged. I don't have the datasheet, so I don't know what it means, but you'll surely be able to make sense of it.

Note that I also left this change

@@ -153 +153 @@
-   {0x3083, 0xB4},
+   {0x3083, 0xb4},

because I consider it to be a typo (capital B in a hex address doesn't seem right to me). But I don't think it's relevant (I don't know whether these hex addresses are case sensitive).

kbingham commented 2 years ago

because I consider it to be a typo (capital B in a hex address doesn't seem right to me). But I don't think it's relevant (I don't know whether these hex addresses are case sensitive).

No, they're not case sensitive - so there is no change to the code there between 0xB4 and 0xb4. It's purely a visual code change, not a functional change.

akobel commented 2 years ago

Ah nice job guys! That's great. We've got the ov2740 datasheet so just need to lookup what those changed values might be and figure out what's wrong... Regarding the colouring, you might try horizontally flipping the sensor output. I don't remember the exact command but it'll be something like v4l2-ctl -d /dev/v4l-subdevN -l to list the controls, and v4l2-ctl -d /dev/v4l-subdevN -c control=value to set them.

For a more permanent solution:

--- a/drivers/media/i2c/ov2740.c
+++ b/drivers/media/i2c/ov2740.c
@@ -231,8 +231,8 @@ static const struct ov2740_reg mode_1932x1092_regs[] = {
        {0x3813, 0x04},
        {0x3814, 0x01},
        {0x3815, 0x01},
-       {0x3820, 0x80},
-       {0x3821, 0x46},
+       {0x3820, 0x04},
+       {0x3821, 0x08},
        {0x3822, 0x84},
        {0x3829, 0x00},
        {0x382a, 0x01},

Couldn't make proper white balancing work yet, but it's a significant improvement.

Misc findings:

This diff also increases the frame rate from appx. 15 fps to 25-28 in 1280x720, and about 22-25 in 1920x1080. Still some room to the promised 60 fps.

Toggling 0x3820 between 0x04 and 0x05 seems to affect HDR; AFAICS, 0x04 is with automatic HDR enabled, 0x05 with HDR disabled (which would be opposite to what the datasheet claims, so not totally sure about that; just guessing).

Also, I realized that the strange noise I heard seems to be a very bad case of coil whine. It's the same on Windows, I just didn't notice before. It becomes significantly better with the higher frame rate and full resolution (qcam -s width=1920,height=1080), and again better when CPU load increases; apparently, the "idling" load with camera active on low fps hits exactly a sweet spot where the coil whine on my machine is most pronounced.

martin9959 commented 2 years ago

For a more permanent solution:

--- a/drivers/media/i2c/ov2740.c
+++ b/drivers/media/i2c/ov2740.c
@@ -231,8 +231,8 @@ static const struct ov2740_reg mode_1932x1092_regs[] = {
        {0x3813, 0x04},
        {0x3814, 0x01},
        {0x3815, 0x01},
-       {0x3820, 0x80},
-       {0x3821, 0x46},
+       {0x3820, 0x04},
+       {0x3821, 0x08},
        {0x3822, 0x84},
        {0x3829, 0x00},
        {0x382a, 0x01},

Couldn't make proper white balancing work yet, but it's a significant improvement.

That didn't make any change for me, the pinkish tint is still the same, and the fps value is unchanged (it was already at the values you mentioned).

Maybe I missed some changes, now I have the following:

martin@T61:~/Downloads/X1/kernel$ diff  drivers/media/i2c/ov2740.c drivers/media/i2c/ov2740-orig.c 
153c153
<   {0x3083, 0xb4},
---
>   {0x3083, 0xB4},
217c217
<   {0x3805, 0x8f},
---
>   {0x3805, 0x8e},
234,235c234,235
<   {0x3820, 0x04},
<   {0x3821, 0x08},
---
>   {0x3820, 0x80},
>   {0x3821, 0x46},
djrscally commented 2 years ago

Just working on the flip controls (plus some info to let libcamera figure out the orientation). There's actually two possible bits to set to do the flipping, which are called out as "array" and "digital", whatever that means. Any chance you can try setting bits 1 and 2 of register 0x3820 separately and then together to see what affect it has?

djrscally commented 2 years ago

Alternatively I just pushed two new commits to add the controls to the driver; https://github.com/djrscally/kernel/tree/thinkpad-x1

They should have the same effect as the changes to 0x3820 and 0x3821 above (libcamera will use the controls to make those same changes in effect) - if anyone could test that would be really useful.

Sorry, i didn't get round to verifying the Crop/Output/HTS/VTS validity, but perhaps tomorrow :)

djrscally commented 2 years ago

OK So, here's the register values for the crop, timing and output:

    {0x3800, 0x00},
    {0x3801, 0x02},
    {0x3802, 0x00},
    {0x3803, 0x0a},
    {0x3804, 0x07},
    {0x3805, 0x8e},
    {0x3806, 0x04},
    {0x3807, 0x4e},
    {0x3808, 0x07},
    {0x3809, 0x88},
    {0x380a, 0x04},
    {0x380b, 0x40},
    {0x380c, 0x04},
    {0x380d, 0x38},
    {0x380e, 0x04},
    {0x380f, 0x60},

That sets...

H Crop Start=2 H Crop End=1934 (making a crop rectangle width of 1932)

V Crop Start=10 V Crop End=1102 (making a crop rectangle height of 1092)

X Output Size=1928 Y Output Size=1088

HTS=1080 VTS=1120

The HTS is odd there and I don't like it. It should be >= the crop width. The output size being less than the crop size is probably wrong. The sensor could output a smaller size than the selected crop rectangle if the chip had an integral scaler like the ov5693, but this one doesn't. Alternatively some pixel binning could be in play but that would only generally allow a reduction of the height/width by half of the crop rectangle and, in any case, neither horizontal nor vertical binning is configured. This set of changes:

@@ -153 +153 @@
-   {0x3083, 0xB4},
+   {0x3083, 0xb4},
@@ -213 +213 @@
-   {0x3801, 0x02},
+   {0x3801, 0x00},
@@ -215 +215 @@
-   {0x3803, 0x0a},
+   {0x3803, 0x00},
@@ -217 +217 @@
-   {0x3805, 0x8e},
+   {0x3805, 0x8f},
@@ -219 +219 @@
-   {0x3807, 0x4e},
+   {0x3807, 0x47},
@@ -224,2 +224,2 @@
-   {0x380c, 0x04},
-   {0x380d, 0x38},
+   {0x380c, 0x08},
+   {0x380d, 0x70},
@@ -227 +227 @@
-   {0x380f, 0x60},
+   {0x380f, 0x56},

Have the effect of resizing the crop rectangle to 1935x1095, the HTS to 2160 and the VTS to 1110...that fixes the HTS problem but not the output size one...and according to this:

OK, I experimented a bit more and it seems to me that this one change makes the difference:

@@ -217 +217 @@
- {0x3805, 0x8e},
+ {0x3805, 0x8f},

The problems are solved simply by changing the horizontal crop to stop at 1935 instead of 1934, which makes no sense at all. I'm going to add another commit that fixes the output size problems as I see them, and hopefully that will do the trick. I'll let you know when ready.

EDIT: Oh shit, looks like I caused most of these issues, or at least changed the values from those that the above diff is setting them back to. I do still think that commit of mine looks right though...the mode calls for 1932x1092 and the pixel array is 1936x1112, I make that H start 2, H end 1934, Y start 10, Y end 1102 which is all right afaics, so I think let's still try the fixup to output size

Second Edit: OK, made that commit too: https://github.com/djrscally/kernel/commits/thinkpad-x1 - if someone could test when they get a moment please, that would be great :)

martin9959 commented 2 years ago

Here are my test results:

For these changes:

@@ -153 +153 @@
- {0x3083, 0xB4},
+ {0x3083, 0xb4},
@@ -213 +213 @@
- {0x3801, 0x02},
+ {0x3801, 0x00},
@@ -215 +215 @@
- {0x3803, 0x0a},
+ {0x3803, 0x00},
@@ -217 +217 @@
- {0x3805, 0x8e},
+ {0x3805, 0x8f},
@@ -219 +219 @@
- {0x3807, 0x4e},
+ {0x3807, 0x47},
@@ -224,2 +224,2 @@
- {0x380c, 0x04},
- {0x380d, 0x38},
+ {0x380c, 0x08},
+ {0x380d, 0x70},
@@ -227 +227 @@
- {0x380f, 0x60},
+ {0x380f, 0x56},

I have found the following effects (additional to the difference made by setting 0x3805 to 0x8f, which has already been discussed):

a)

@@ -224,2 +224,2 @@
- {0x380c, 0x04},
- {0x380d, 0x38},
+ {0x380c, 0x08},
+ {0x380d, 0x70},

brings the fps rate from around 30/22 fps (with/without -r gles) down to around 15 fps (with or without -r gles).

b)

@@ -219 +219 @@
- {0x3807, 0x4e},
+ {0x3807, 0x47},

takes away the colour change: With 0x4e, when setting vertical_flip=1, the image changes from a pink to a green tint (in addition to being upside down, obviously). With 0x47, vertical_flip=1 flips the image upside down, but does not change its colour.

Second Edit: OK, made that commit too: https://github.com/djrscally/kernel/commits/thinkpad-x1 - if someone could test when they get a moment please, that would be great :)

This:

-   {0x380b, 0x40},
+   {0x380b, 0x44},

has the positive effect that the "payload length is x, received y" error message goes away (for the first time, AFAICT).

On the other hand,

-   {0x3809, 0x88},
+   {0x3809, 0x8c},

has the negative effect that horizontal image flipping now longer works - when setting horizontal_flip=1, the image hangs instead of flipping (no error message, the fps counter continues to show normal values, and the image comes back to life when I set horizontal_flip back to 0).

Setting both 0x380b and 0x3809 to the values mentioned above has the positive effect as mentioned, but neither horizontal nor vertical flipping work.

djrscally commented 2 years ago

b)

@@ -219 +219 @@
-   {0x3807, 0x4e},
+   {0x3807, 0x47},

takes away the colour change: With 0x4e, when setting vertical_flip=1, the image changes from a pink to a green tint (in addition to being upside down, obviously). With 0x47, vertical_flip=1 flips the image upside down, but does not change its colour.

This is probably because it results in an odd-sized output; 1095 rows instead of 1094/1096. I think libcamera expects an even size to preserve the bayer order.

Second Edit: OK, made that commit too: https://github.com/djrscally/kernel/commits/thinkpad-x1 - if someone could test when they get a moment please, that would be great :)

This:

- {0x380b, 0x40},
+ {0x380b, 0x44},

has the positive effect that the "payload length is x, received y" error message goes away (for the first time, AFAICT).

OK, that's good then. And just to clarify; with only the commits from that tree and no other additions on top, qcam was streaming the picture? As originally that wasn't working if I remember correctly.

On the other hand,

- {0x3809, 0x88},
+ {0x3809, 0x8c},

has the negative effect that horizontal image flipping now longer works - when setting horizontal_flip=1, the image hangs instead of flipping (no error message, the fps counter continues to show normal values, and the image comes back to life when I set horizontal_flip back to 0).

Setting both 0x380b and 0x3809 to the values mentioned above has the positive effect as mentioned, but neither horizontal nor vertical flipping work.

I think this is more likely to be some screwup with the flip controls I added than those registers, as they shouldn't have that effect...let me look again

martin9959 commented 2 years ago

OK, that's good then. And just to clarify; with only the commits from that tree and no other additions on top, qcam was streaming the picture? As originally that wasn't working if I remember correctly.

I did a git pull and added the changes mentioned in your post as well as this patch, which is still needed to cater for the differences between Gen 1 and Gen 2 devices, if I'm not mistaken.

djrscally commented 2 years ago

OK, that's good then. And just to clarify; with only the commits from that tree and no other additions on top, qcam was streaming the picture? As originally that wasn't working if I remember correctly.

I did a git pull and added the changes mentioned in your post as well as this patch, which is still needed to cater for the differences between Gen 1 and Gen 2 devices, if I'm not mistaken.

Ah-ha, right. That set of changes is what you'd already tried sorry. Could we try with just my tree plus that patch (which may well be needed yes)

martin9959 commented 2 years ago

OK, that's good then. And just to clarify; with only the commits from that tree and no other additions on top, qcam was streaming the picture? As originally that wasn't working if I remember correctly.

I did a git pull and added the changes mentioned in your post as well as this patch, which is still needed to cater for the differences between Gen 1 and Gen 2 devices, if I'm not mistaken.

Ah-ha, right. That set of changes is what you'd already tried sorry. Could we try with just my tree plus that patch (which may well be needed yes)

Sure! That brings me back to the beginning, the flickering half-image like here.

djrscally commented 2 years ago

Nuts, ok. And what about if you just add this:

OK, I experimented a bit more and it seems to me that this one change makes the difference:


@@ -217 +217 @@
- {0x3805, 0x8e},
+ {0x3805, 0x8f},
martin9959 commented 2 years ago

Nuts, ok. And what about if you just add this:

OK, I experimented a bit more and it seems to me that this one change makes the difference:

@@ -217 +217 @@
-   {0x3805, 0x8e},
+   {0x3805, 0x8f},

Then I get:

akobel commented 2 years ago

Same here.

By the way: does anyone know whether/how it's possible to test the effects of changes to ov2740.c without a reboot? My current approach is build module, install module, reboot; essentially because I can't modprobe -r ov2740 due to the module being in use. Which seems fine; after all, it was used to probe the sensor upon initialization of the TPS68470, right? But after that, it idles until the sensor is actually powered, so I would hope that I can either replace the driver behind the scenes, or perhaps disable the sensor, patch the module, and the reenable. (If you give a positive answer, I'll totally hate myself for not asking way earlier. ;-))

@djrscally I'm not sure how trustworthy the ("preliminary") datasheet is. E.g., I see different defaults and slightly different (but obviously related) meanings listed between the "image windowing controls" and the "timing control" registers, Or for 0x3821, one table says that bits 0 and 5 are meant for binning, another table says it's bits 0 and 7, just to name some. Do you have any idea how similar the OV sensors are, and whether it might make sense to consult an officially released datasheet (if accessible) for a similar sensor? [OTOH, e.g. for OV8865 you'll easily find a datasheet without "preliminary" in it via Google; but it even shows 3 different defaults for 0x3821; or they are meant to be XORed...]

djrscally commented 2 years ago

By the way: does anyone know whether/how it's possible to test the effects of changes to ov2740.c without a reboot? My current approach is build module, install module, reboot; essentially because I can't modprobe -r ov2740 due to the module being in use. Which seems fine; after all, it was used to probe the sensor upon initialization of the TPS68470, right? But after that, it idles until the sensor is actually powered, so I would hope that I can either replace the driver behind the scenes, or perhaps disable the sensor, patch the module, and the reenable. (If you give a positive answer, I'll totally hate myself for not asking way earlier. ;-))

Yeah I think you're stuck with that method, that's what I do too.

@djrscally I'm not sure how trustworthy the ("preliminary") datasheet is. E.g., I see different defaults and slightly different (but obviously related) meanings listed between the "image windowing controls" and the "timing control" registers, Or for 0x3821, one table says that bits 0 and 5 are meant for binning, another table says it's bits 0 and 7, just to name some. Do you have any idea how similar the OV sensors are, and whether it might make sense to consult an officially released datasheet (if accessible) for a similar sensor? [OTOH, e.g. for OV8865 you'll easily find a datasheet without "preliminary" in it via Google; but it even shows 3 different defaults for 0x3821; or they are meant to be XORed...]

I have the sheets for a few sensors (including the 8865), and honestly they're all janky. OmniVision may be good at some things, but documentation is absolutely not one of those things.

Anyway, it's crazy to me that that one change has that affect. My last guess is that the windowing offsets aren't zero (they are in the drivers I'm familiar with) so maybe that's throwing out the timings. Let's try changing those to 0x00 too, can we try my tree with:

0x3805 set to 0x8e 0x3810 set to 0x00 0x3811 set to 0x00 0x3812 set to 0x00 0x3813 set to 0x00 0x3821 set to 0x08

That sets the output back to what it should be, and the offsets to 0 (as they are in the ov5693 I wrote, and the ov8865 and ov5648 which are the other two I'm most familiar with). It also turns horizontal flip off by default; I've just realised the mode is setting it on and the control should handle that now instead.

kbingham commented 2 years ago

run 'lsmod' with the module loaded, when you're not using the camear - and it should show you what other things are blocking it if they are.

For a sensor, I can't imagine anything is directly using it - so a log of your lsmod would be helpful to confirm. You ... can ... force a module unload I believe (if the option is supported in your kernel) - but thar be dragons ...

djrscally commented 2 years ago

run 'lsmod' with the module loaded, when you're not using the camear - and it should show you what other things are blocking it if they are.

For a sensor, I can't imagine anything is directly using it - so a log of your lsmod would be helpful to confirm. You ... can ... force a module unload I believe (if the option is supported in your kernel) - but thar be dragons ...

I get nothing listed:

ov8865            32768    1
ov5693            20480    1

But sudo modprobe -r ov8865 gives me modprobe: FATAL: Module ov8865 is in use

kbingham commented 2 years ago

from man modprobe:

   -r, --remove
       This option causes modprobe to remove rather than insert a module. If the modules it depends on are also unused, modprobe will try to remove them too. Unlike insertion, more than one module can be specified on the command line
       (it does not make sense to specify module parameters when removing modules).

       There is usually no reason to remove modules, but some buggy modules require it. Your distribution kernel may not have been built to support removal of modules at all.

Are there 'any' modules listed with a use count of '0'? I wouldn't be surprised if the 'kernel option to disable module removal' always takes a module reference to block removal.

Can you check your kernel config settings for modules?


cat /boot/config-5.9.0-050900rc7-generic | grep MODULE
...
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
...
djrscally commented 2 years ago

from man modprobe:

   -r, --remove
       This option causes modprobe to remove rather than insert a module. If the modules it depends on are also unused, modprobe will try to remove them too. Unlike insertion, more than one module can be specified on the command line
       (it does not make sense to specify module parameters when removing modules).

       There is usually no reason to remove modules, but some buggy modules require it. Your distribution kernel may not have been built to support removal of modules at all.

Are there 'any' modules listed with a use count of '0'? I wouldn't be surprised if the 'kernel option to disable module removal' always takes a module reference to block removal.

There are a bunch with 0 use count yes...including the ipu3_imgu and ipu3_cio2 modules.

Can you check your kernel config settings for modules?


cat /boot/config-5.9.0-050900rc7-generic | grep MODULE
...
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
...

I have CONFIG_MODULE_UNLOAD=y and CONFIG_MODULE_FORCE_UNLOAD is not set