I2C clock frequency incorrect on FT2232H

jfranz-argo commented 4 years ago

Hello,

I noticed some strange behavior when trying to get dual I2C masters working on an FT2232 Mini-Module. On the second execution of the following code, the clock frequency is much higher than first execution, causing a NACK.

#!/usr/bin/python3

from pyftdi.i2c import I2cController
from pyftdi.usbtools import UsbTools
import time

i2c_1 = I2cController()
i2c_2 = I2cController()
timeout = time.time() + 5.0
while time.time() < timeout:
    try:
        i2c_1.configure("ftdi://ftdi:2232h/1", frequency=100000)
        i2c_2.configure("ftdi://ftdi:2232h/2", frequency=100000)
        break
    except:
        UsbTools.flush_cache()
        time.sleep(0.05)
        continue

port = i2c_2.get_port(0x76)

print(port.read_from(0x00, 2))
print(port.read_from(0x00, 2))

Console output:

username@workstation$ python3 ftditest.py
bytearray(b'8\x00')
bytearray(b'8\x00')
username@workstation$ python3 ftditest.py
Traceback (most recent call last):
  File "ftditest.py", line 31, in <module>
    print(port.read_from(0x00, 2))
  File "/home/username/.local/lib/python3.6/site-packages/pyftdi/i2c.py", line 148, in read_from
    out=self._make_buffer(regaddr), readlen=readlen, relax=relax)
  File "/home/username/.local/lib/python3.6/site-packages/pyftdi/i2c.py", line 752, in exchange
    self._do_prolog(i2caddress)
  File "/home/username/.local/lib/python3.6/site-packages/pyftdi/i2c.py", line 1009, in _do_prolog
    raise I2cNackError('NACK from slave')
pyftdi.i2c.I2cNackError: NACK from slave

This doesn't seem to occur when only using a single port on the FT2232H. Additionally, I've noticed that in some situations the script works every other time I run it, although I haven't been able to identify what conditions cause that.

eblot commented 4 years ago

I should say I never tried this setup. I will try to reproduce it.

eblot commented 4 years ago

For now, I'm unable to reproduce this issue: always 100 KHz on I2C2 (BD0, BD1) on FT232H MiniModule.

eblot commented 4 years ago

What does happen on the Saleae capture before the invalid clock? SDA and SCL are low, i.e. the start sequence does not appear on the capture.

eblot commented 4 years ago

jfranz-argo commented 4 years ago

Here is a screenshot of the start sequence:

Here is the first thing that happens on the bus (what triggers the capture), but it doesn't appear to be I2C related:

jfranz-argo commented 4 years ago

I just tried it on Mac, and it works fine. The issue I was having was with Linux (Ubuntu 18.04).

eblot commented 4 years ago

For the record I have the same initialisation glitches, I'm not sure it can be avoided (as the FTDI is reset @ startup).

I do use macOS. I did not tried yet on Linux.

I wonder whether the Linux kernel could change the settings somehow... Have you tried to unload/blacklist the serial driver (ftdi_sio.ko I think)?

jfranz-argo commented 4 years ago

I just tried unloading the ftdi_sio.ko kernel module, but it didn't solve the problem.

jfranz-argo commented 4 years ago

When I add one more print(port.read_from(0x00, 2)), the issue goes away. However, adding a 4th causes the issue to reappear. It seems like there needs to be an even/odd (unsure which) number of commands in order to cause this issue.

ElmarJongerius commented 4 years ago

I have seen this issue also on Win7 with v0.40.6. If I can test (and help to fix this) please let me know; as this also gave me some problems.

eblot commented 4 years ago

If you have anyway to capture the actual USB transfers between the host and the FTDI device, it guess it would help to understand if a faulty command is sent to the FTDI device.

jfranz-argo commented 4 years ago

Do you have a recommended tool for this? I can procure a hardware or software tool to try and capture the USB transfers. I confirmed the commands sent to pyusb appear to be the same.

eblot commented 4 years ago

HW USB analysers are quite expensive (~ $1000 for USB2 HS). SW USB analysers are cheaper, some are free, but they are usually OS-specific. There are many for Windows, a few for Linux (including a Wireshark-based one), I do not know one on macOS.

A nice way would be to capture the USB pass-through bridge on a VM (VmWare, Parallels, VirtualBox or even better: QEMU), so the capture tool would be OS-agnostic... I did not find one for now, but I have looked for long.

I guess that if the problem can be reproduced with a Linux host, the most straightforward way is to use Wireshark + USB module to do the capture. (?)

jfranz-argo commented 4 years ago

I'm currently working on wireshark captures, and I'm running into issues with differences between OSes as you mentioned. It does not seem to be different from the pyusb debugging outputs. I will upload something when I'm confident that I've captured something worth sharing.

Is there an API reference for the MPSSE engine I can use to decode the data coming in/going out?

eblot commented 4 years ago

Is there an API reference for the MPSSE engine I can use to decode the data coming in/going out?

The reference from FTDI chip would be Application Note 108 "Command Processor for MPSSE and MCU Host Bus Emulation Modes" available from FTDI web site.

You can also use the integrated MPSSE decoder, albeit incomplete, e.g.:

# spi_cs_direct branch
$ PYTHONPATH=. FTDI_DEBUG=on FTDI_LOGLEVEL=DEBUG pyftdi/tests/spi.py
Using MPSSE tracer
> 8b860000
[Command: 8B: ENABLE_CLK_DIV5]
 Enable clock divisor /5
[Command: 86: SET_TCK_DIVISOR]
 Set frequency 6.000MHZ
< 3260
Bus frequency: 6.000000 MHz (error: +0.0 %)
> 80080b820000
[Command: 80: SET_BITS_LOW]
 Set gpio[7:0]  08 ____1_00
[Command: 82: SET_BITS_HIGH]
 Set gpio[15:8] 00 ________
> 85
[Command: 85: LOOPBACK_END]
 Disable loopback
< 3260
> 97
[Command: 97: DISABLE_CLK_ADAPTIVE]
 Disable adaptive clock
> 87
[Command: 87: SEND_IMMEDIATE]
 Send immediate
> 8b860500
[Command: 8B: ENABLE_CLK_DIV5]
 Enable clock divisor /5
[Command: 86: SET_TCK_DIVISOR]
 Set frequency 1.000MHZ
< 3260
Bus frequency: 1000.000 KHz (error: +0.0 %)
> 800c0b80040b80040b80040b800c0b800c0b
[Command: 80: SET_BITS_LOW]
 Set gpio[7:0]  0c ____1_00
[Command: 80: SET_BITS_LOW]
 Set gpio[7:0]  04 ____0_00

Two conditions to get those traces:

instantiate Ftdi with debug argument I2cController.configure(..., debug=True)
set the pyftdi.mpsse.tracer log level to DEBUG, e.g. FtdiLogger.set_level(logging.DEBUG) to get traces as HEX bytes and MPSSE decoded protocol.

jfranz-argo commented 4 years ago

Another datapoint I forgot to include: Following the instructions in this stack overflow thread resets the USB port and gets the device working again https://askubuntu.com/questions/645/how-do-you-reset-a-usb-device-from-the-command-line

Not sure if it helps, but figured I'd share the info in case anyone else is having the problem.

eblot commented 4 years ago

Note that v0.44.0 fixes a couple of MPSSE parsing issues, you might want to upgrade if you use the tracer to help debugging.

jfranz-argo commented 4 years ago

I found something possibly strange, although it might just be that I don't understand Python well enough. I'm using a class that contains the pyftdi objects, and its destructor calls i2c.terminate(). I added print(self.usb_dev) to the ftdi.close() function in ftdi.py in a local copy of the pyftdi repo. When my destructor runs at the end of the program (due to the object going out of scope), the program hangs when it tries to print out self.usb_dev. However, when I call del(my_obj) at the end of my program, thereby invoking the destructor before the object goes out of scope, the print statement prints out fine and the issue does not manifest.

Below is an example of what I'm referring to:

#!/usr/bin/python3

from pyftdi.i2c import I2cController
from pyftdi.usbtools import UsbTools
import time

class MyClass:
    def __init__(self):
        self.i2c = I2cController()
        timeout = time.time() + 5.0
        while time.time() < timeout:
            try:
                self.i2c.configure("ftdi://ftdi:2232h/2", frequency=100000)
                break
            except:
                UsbTools.flush_cache()
                time.sleep(0.05)
                continue
        self.port = self.i2c.get_port(0x76)
    def __del__(self):
        print("destructor called")
        self.i2c.terminate()

my_obj = MyClass()
print(my_obj.port.read_from(0x00, 2))
print(my_obj.port.read_from(0x00, 2))

# Uncommenting this line eliminates the problem
# del(my_obj)

When the last line is uncommented, the I2C frequency is always correct.

Here's a snippet of the edit I made to close() in ftdi.py:

    def close(self) -> None:
        """Close the FTDI interface/port."""
        print(self.usb_dev)
        if self.usb_dev:
            dev = self.usb_dev
            ...

Without this change, the issue (NACKs due to wrong frequency) still occur, but the program does not hang.

eblot commented 4 years ago

Well...

TL;DR: do not :-)

Long version:

Never ever use a destructor in Python except if you have a very, very specific and verified need to do so. __del__ is not the equivalent of a C++ destructor. You cannot be sure when it will be called (because of the GC). In other words, the order in which it is called is undefined, at least from a user perpective.

This piece of advice is especially true with wrapped native library: pyftdi calls pyusb which relies on libusb. One way or another, this is likely to create a call sequence where you destroy/cleanup the underlying memory blocks in libusb, then resume calling some function on libusb using these memory blocks (i.e. access-after-free), as you cannot manage when pyusb calls libusb vs. when your __del__ method is actually called.

In the early versions of PyFtdi, I fought against these daemons for a while, before realizing it was a very bad idea ™ to rely on __del__ :-)

So I would recommend to stay away from using __del__ to invoke anything from PyFtdi (or PyUsb). If you want to perform some final cleanup when the interpreter is about to close, use atexit.register. If you want to close the PyFtdi instance before the interpreter ends, add an explicit method and call it, but do not use __del__. Even if it avoid failing at some point, it would still be a hidden flaw that would bite another day.

jfranz-argo commented 4 years ago

Thanks for the advice! It seems like I misunderstood when to use __del__, although maybe I still don't understand. I'll keep reading to see what I can learn.

It seems like calling i2c.terminate() or ftdi.close() solves my problem. Is there any way to guarantee that it will run before the program exits, even in a case when the program does not exit normally? I'd prefer to avoid having to wrap everything in a try except finally. Some thoughts below:

Implement my class with a context manager, and use __enter__ and __exit__ to manage the pyftdi objects. I'm not a huge fan of this, but it could work for my use case.
Change something in pyftdi to ensure that the pyftdi.close() function is called. Not sure what the best way to do this would be. Maybe atexit.register?
Use atexit.register in my class, as you suggested. I'll likely try this first, as it seems like a simple fix. I'll try this out and get back to you.

Please let me know what you think would be the best long term solution to this issue. I likely won't have time to keep digging into the root cause of this issue. Due to the OS-specific nature of it, I'm fairly convinced that it's an issue with something below pyftdi, such as pyusb or libusb. Or worse, undocumented silicon errata.

eblot commented 4 years ago

atexit.register is the easiest way.

However, I'm not sure why you need to call close/terminate. I'm missing the point: there's a bug if you do not call it?

jfranz-argo commented 4 years ago

Yes, when I do not call close/terminate, sometimes the I2C frequency is incorrect causing NACKs on the bus. This no longer occurs when I call close/terminate before exiting the program.

eblot commented 4 years ago

Ok, let me try to sum up the issue:

Issue

The I2C invalid clock frequency occurs at random when:

two I2C master are created on the same FTDI HW
the Python interpreter exits without proper release of the FTDI interface

Questions

Does this only occur when both I2C masters are instantiated from the same interpreter (same process), or does it also occurs when two Python interpreters are started, each managing a single I2C master?
Can this bug be reproduced with a single I2C master, and the interpreter exiting w/o proper FTDI cleanup?

jfranz-argo commented 4 years ago

Here's my summary, including some of the findings above:

Symptom: I2C bus frequency is incorrect

• Only found so far on 2232H, only on 2nd interface (e.g. "ftdi://ftdi:2232h/2") • Does not happen on first execution after plugging device in • Issuing a USBDEVFS_RESET ioctl behaves the same as unplugging/plugging in device • Issue does not always manifest, suspect that it is dependent on number of commands/messages sent to FTDI device. • Have not seen issue after adding close/terminate before end of program

To answer your questions:

No, also manifests when only 2nd I2C interface is instantiated
Yes, see https://github.com/eblot/pyftdi/issues/159#issuecomment-589907089

jfranz-argo commented 4 years ago

Given that it only happens with the 2232H, and that it seems to be related to the number of commands sent (See https://github.com/eblot/pyftdi/issues/159#issuecomment-566696153), is it possible that an 8-bit command is sent for the 16-bit device, putting the device in an unknown state?

eblot commented 4 years ago

No, also manifests when only 2nd I2C interface is instantiated

Yes, see #159 (comment)

Ok, so the issue can be simplified/summed as:

Closing PyFTDI on the second MPSSE port w/o proper termination can lead to incorrect clock on next access.

It can be reproduced with a unique I2C master, as long as the second port is used. Am I right?

eblot commented 4 years ago

Given that it only happens with the 2232H, and that it seems to be related to the number of commands sent (See #159 (comment)), is it possible that an 8-bit command is sent for the 16-bit device, putting the device in an unknown state?

AFAICT, there is no 8-bit/16-bit concept with FTDI. All commands are 8-bit, and I do not recall any 16-bit alignment constraint somewhere in the specs.

jfranz-argo commented 4 years ago

Your simplified summary is correct.

It can be reproduced with a unique I2C master, as long as the second port is used. Am I right?

Unique == same part number, different unit? Yes, this occurs on all 2232H devices I've used. I have not observed this behavior on a 4232H, but I will test and let you know if it doesn't work.

eblot commented 4 years ago

I've added a new API, Ftdi.reset() in 98a808d18bd0143577a5362a0e2b81b970080def, which optionally performs a full USB-level reset.

It is not automatically called, as it resets all interfaces at once, but it might help to recover from a FTDI chip internal unstable state.

jfranz-argo commented 4 years ago

Thanks, I'll try this out.

eblot / pyftdi

I2C clock frequency incorrect on FT2232H #159

Issue

Questions