melexis / mlx90640-library

MLX90640 library functions
Apache License 2.0

Extreme timing sensitivity #7

Closed Gadgetoid closed 3 years ago

Gadgetoid commented 6 years ago

I seem to be running into an issue where any slight variance in the amount of time I delay between reads will cause the camera to output erroneous readings.

I am trying to use step mode to get data for subpage 0 and 1, which I can get to run successfully, but only under certain circumstances which I can't seem to pin down or recreate.

The datasheet mentions flags that should be polled for step mode:

2. By polling the flags (configuration IO at 0x92, bits 8 and 9)

But 0x92 is not a valid register address- these are 16-bit registers- and there's no "configuration IO" to my knowledge?

I have also tried polling the status register (0x8000) and checking bit b3 (0x08), but despite this being the "new data is available in RAM" bit, a read right after it's set seldom results in any valid sensor data.

Is the datasheet simply wrong, or am I missing something here? This library doesn't seem to support reading both subpages, and your own MLX90640_GetFrameData appears to start a measurement up to 9 times and return an error if it exceeds 4 attempts?

https://github.com/melexis/mlx90640-library/blob/388d00527b0f337377877414d62c33b1ae59bb75/functions/MLX90640_API.cpp#L44-L101

Looking at GetFrameData: since the first thing it does is check the status register for a "data ready" condition, won't the first call to GetFrameData always fail in "step mode", since no reading will have been triggered?

Gadgetoid commented 6 years ago

Here's the code I'm running, set up to function on a Raspberry Pi- https://github.com/pimoroni/mlx90640-library

Note the setting of alternate pages to get the full data every 2 loops:

https://github.com/pimoroni/mlx90640-library/blob/aa70685380507c51307ea745bcf18e17739ce791/test.cpp#L49-L54

If I change this delay, the sensor will give invalid data: https://github.com/pimoroni/mlx90640-library/blob/aa70685380507c51307ea745bcf18e17739ce791/test.cpp#L85

I made a separate attempt (not currently in this repo because it's a broken mess) to set the page, trigger a new reading, wait for the interrupt (by polling the register), read the data, and clear the interrupt but can't seem to get it to work.

slavysis commented 6 years ago

Hi,

First of all, thanks for finding an error in the datasheet: 0x92 is indeed not a valid register address. This is already fixed in the datasheet, and the new version will be uploaded soon.

The library does support reading both subpages. The only thing you need to know is which subpage the frame data is for, which you can get from the MLX90640_GetSubPageNumber function, as you already do in your code. The MLX90640 automatically alternates between the two subpages, and the read-out data alternates accordingly. The driver takes into account which subpage the data is for and calculates only the corresponding object temperatures. You can see this in the sample data file. Basically, if you need the whole frame, the only things to take care of are proper communication with the device and proper memory allocation and addressing when calling the different driver functions.

I believe the issues you have are caused by slow data readout. As you pointed out, the driver function that reads the frame data makes up to 5 tries and returns an error if all of them fail. This is done because the whole frame must be read out before a new measurement is started. If the readout time is almost the same as the refresh period, a complete readout before the next measurement might succeed very seldom, if at all. There are two ways to fix this:

  1. Increase the i2c frequency in order to decrease the read-out time. Note that the device supports FM+ (1 MHz), but the EEPROM should be read at a maximum of 400 kHz, so you would have to switch the i2c clock.
  2. Decrease the MLX90640 refresh rate in order to have more time for the data readout.

I think the best approach to sort this out is:

  1. Use continuous mode (the device still alternates the subpages automatically).
  2. Set the refresh rate of the MLX90640 to 0.5 Hz. This allows a readout time of 2 seconds.
  3. Check the i2c clock signal frequency. At 1 MHz, a full frame readout takes a bit less than 15 ms, so unless the clock is really slow you should not have problems with invalid data.

If the above routine works, you can start optimizing the refresh rate and the i2c speed until you get the desired performance.
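
To make the readout-time argument concrete, here is a rough estimate. It assumes every byte costs 9 SCL clocks (8 data bits plus the ACK) and ignores start/stop conditions, address bytes and clock stretching. The function names are illustrative and not part of the library.

```cpp
#include <cstdint>

// Rough I2C transfer time in milliseconds for `words` 16-bit words.
// Assumption: 9 clock cycles per byte (8 data bits + ACK); start/stop
// conditions, address bytes and clock stretching are ignored.
double estimateReadoutMs(std::uint32_t words, double i2cClockHz)
{
    const double clocks = static_cast<double>(words) * 2.0 * 9.0;
    return clocks / i2cClockHz * 1000.0;
}

// True if the estimated readout fits inside one refresh period.
bool readoutFitsRefresh(std::uint32_t words, double i2cClockHz, double refreshHz)
{
    return estimateReadoutMs(words, i2cClockHz) < 1000.0 / refreshHz;
}
```

Under these assumptions, 832 words come out just under 15 ms at 1 MHz and at roughly 37 ms at 400 kHz, which is why the refresh rate and the bus clock have to be chosen together.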

adibacco commented 6 years ago

I also tried to use step mode, but I also get really weird frame data (I2C running at 400 kHz). Step mode is really needed if you have a battery-powered device.

It would be really nice if you could implement an API like the one below, which reads both subpages and puts them into the frameData array. Afterwards the user would call a slightly modified MLX90640_CalculateTo that takes an additional parameter indicating the subpage, without looking into frameData[833].

Unfortunately the code of my MLX90640_GetCompleteFrameData_StepMode is not working properly.

int MLX90640_GetCompleteFrameData_StepMode(uint8_t slaveAddr, uint16_t *frameData)
{
    uint16_t dataReady = 1;
    uint16_t controlRegister1;
    volatile uint16_t statusRegister;
    int error = 1;
    uint8_t cnt = 0;

    dataReady = 0;
    uint16_t ctrl_reg;
    MLX90640_I2CRead(MLX90640_I2C_ADDR, 0x800D, 1, &ctrl_reg);
    // Enable step mode
    MLX90640_I2CWrite(MLX90640_I2C_ADDR, 0x800D, ctrl_reg | 0x0002);

    HAL_Delay(10);

    // Trigger acquisition (subpage 0)
    error = MLX90640_I2CWrite(slaveAddr, 0x8000, 0x0030);
    if(error == -1)
    {
        return error;
    }

    HAL_Delay(1000); // Probably this is too much??

    error = MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);

    // Trigger acquisition (subpage 1)
    error = MLX90640_I2CWrite(slaveAddr, 0x8000, 0x0030);
    if(error == -1)
    {
        return error;
    }

    HAL_Delay(1000);

    error = MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);

    error = MLX90640_I2CRead_Bulk(slaveAddr, 0x0400, 832, frameData);
    if(error != 0)
    {
        return error;
    }

    error = MLX90640_I2CRead(slaveAddr, 0x800D, 1, &controlRegister1);
    frameData[832] = controlRegister1;
    error = MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);

    frameData[833] = statusRegister & 0x0001;
    // Both subPages acquired
    if(error != 0)
    {
        return error;
    }

    return frameData[833];
}

Gadgetoid commented 6 years ago

I still don't understand why step mode isn't working how I would expect- no data overwrite should happen until I clear the interrupt, surely?

From what I can glean, the step mode process should be roughly:

  1. Specify which subpage I want (bits 6 down to 4 in 0x800D, although in practice only bit 4 is used)
  2. Trigger a reading (set bit 5 of the status register to 1 and bit 3 to 0)
  3. Wait for the data ready flag (status register bit 3 or 0x0008, I'm checking this flag every 1ms)
  4. Read the data (832 16-bit words from 0x0400)
  5. Clear the interrupt flag (status register bit 3 or 0x0008 - datasheet says "must be reset by customer")

In code, this is something like:

int MLX90640_CheckInterrupt(uint8_t slaveAddr)
{
    uint16_t statusRegister;
    MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);
    return (statusRegister & 0b1000) > 0;
}

void MLX90640_StartMeasurement(uint8_t slaveAddr, uint8_t subPage)
{
    uint16_t controlRegister1;
    uint16_t statusRegister;
    MLX90640_I2CRead(slaveAddr, 0x800D, 1, &controlRegister1);
    controlRegister1 &= 0b1111111111101111;
    controlRegister1 |= subPage << 4;
    MLX90640_I2CWrite(slaveAddr, 0x800D, controlRegister1);
    MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);
    statusRegister &= 0b1111111111110111; // Clear b3: new data available in RAM
    statusRegister |= 0b0000000000110000; // Set b5: start of measurement
                                          // Set b4: enable RAM overwrite
    MLX90640_I2CWrite(slaveAddr, 0x8000, statusRegister);
}

int MLX90640_GetData(uint8_t slaveAddr, uint16_t *frameData)
{
    int error = 0;
    uint16_t statusRegister;
    uint16_t controlRegister1;

    // Get page data
    error = MLX90640_I2CRead(slaveAddr, 0x0400, 832, frameData);

    // Get status register
    MLX90640_I2CRead(slaveAddr, 0x8000, 1, &statusRegister);

    // Get control register
    MLX90640_I2CRead(slaveAddr, 0x800D, 1, &controlRegister1);

    frameData[832] = controlRegister1;
    frameData[833] = statusRegister & 0x0001; // Populate the subpage number

    return error; // Result of the frame read
}

int MLX90640_SetDeviceMode(uint8_t slaveAddr, uint8_t deviceMode)
{
    uint16_t controlRegister1;
    int value;
    int error;

    value = (deviceMode & 0x01)<<4;

    error = MLX90640_I2CRead(slaveAddr, 0x800D, 1, &controlRegister1);
    if(error == 0)
    {
        value = (controlRegister1 & 0b1111111111111101) | value;
        error = MLX90640_I2CWrite(slaveAddr, 0x800D, value);
    }

    return error;
}

int MLX90640_SetSubPageRepeat(uint8_t slaveAddr, uint8_t subPageRepeat)
{
    uint16_t controlRegister1;
    int value;
    int error;

    value = (subPageRepeat & 0x01)<<3;

    error = MLX90640_I2CRead(slaveAddr, 0x800D, 1, &controlRegister1);
    if(error == 0)
    {
        value = (controlRegister1 & 0b1111111111110111) | value;
        error = MLX90640_I2CWrite(slaveAddr, 0x800D, value);
    }

    return error;
}

int main(){
    static uint16_t eeMLX90640[832];
    float emissivity = 1;
    uint16_t frame[834];
    static float image[768];
    float eTa;
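    static uint16_t data[768*sizeof(float)];
    static float mlx90640To[768]; // Object temperatures written by MLX90640_CalculateTo
    int subpage;                  // Subpage number of the most recently read frame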

    MLX90640_SetDeviceMode(MLX_I2C_ADDR, 1);
    MLX90640_SetSubPageRepeat(MLX_I2C_ADDR, 1);
    MLX90640_SetRefreshRate(MLX_I2C_ADDR, 0b100);
    MLX90640_SetChessMode(MLX_I2C_ADDR);

    paramsMLX90640 mlx90640;
    MLX90640_DumpEE(MLX_I2C_ADDR, eeMLX90640);
    MLX90640_ExtractParameters(eeMLX90640, &mlx90640);

    MLX90640_StartMeasurement(MLX_I2C_ADDR, 0);

    while(1){
        while (!MLX90640_CheckInterrupt(MLX_I2C_ADDR)){
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        MLX90640_GetData(MLX_I2C_ADDR, frame);
        subpage = MLX90640_GetSubPageNumber(frame);

        // Start the next measurement
        MLX90640_StartMeasurement(MLX_I2C_ADDR, !subpage);

        eTa = MLX90640_GetTa(frame, &mlx90640);
        MLX90640_CalculateTo(frame, &mlx90640, emissivity, eTa, mlx90640To);
    }
}

In this mode, the RAM should never be overwritten because I'm explicitly requesting a new measurement start after I have read the data from RAM.

Now steps 1 - 5 run seemingly correctly but all I get out of the camera in this mode is barely correlated noise- a circle of almost maximum values, within a sea of "nan" or just a whole frame of evenly distributed noise.

My readings are taking around 57ms for the full set of data including the status and control registers.

I'm using a logic analyser to capture the data, and everything is happening as I expect it.

The nonsense data returned seems related to the refresh rate I specify, too:

It's worth noting that the sudden change from uniform noise to a blob happens at the transition from 8 Hz to 16 Hz, which correlates with the point at which the camera refresh rate becomes faster than the i2c data rate (16 Hz is 62.5ms per frame). But since this is in step mode, that should be of no consequence?
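
The arithmetic behind that boundary can be sketched as follows. It assumes the 3-bit refresh-rate code passed to MLX90640_SetRefreshRate maps to 0.5 Hz × 2^code, so that 0b100 is 8 Hz and 0b101 is 16 Hz; the helper names are made up for illustration.

```cpp
#include <cstdint>

// Refresh period in milliseconds for a 3-bit refresh-rate code.
// Assumption: rate = 0.5 Hz * 2^code, i.e. code 0 -> 0.5 Hz ... code 7 -> 64 Hz.
double refreshPeriodMs(std::uint8_t code)
{
    const double rateHz = 0.5 * static_cast<double>(1u << (code & 0x07));
    return 1000.0 / rateHz;
}

// True if a transfer lasting `transferMs` finishes within one refresh period.
bool transferFitsPeriod(double transferMs, std::uint8_t code)
{
    return transferMs < refreshPeriodMs(code);
}
```

For a 57ms transfer this gives a period of 125ms at 8 Hz and 62.5ms at 16 Hz, so the spare time per period shrinks from 68ms to 5.5ms across that transition.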

In all cases a frame of data takes about 57ms to transfer. Why does changing the camera refresh rate drastically alter the results returned? Does step mode work at all? Is there some information missing from the datasheet we need to properly activate/use it?

[images: image from ios 1, image from ios, image from ios 2]

Gadgetoid commented 6 years ago

For comparison here's a photo of it running in normal mode at 16hz and producing meaningful output.

image from ios 3

It's not evident in the photo, but your getFrameData code wastes so much time requesting the same frame data over and over that the practical framerate is only ~2fps.

Rather than taking just 57ms it's taking 170ms (measured with a logic analyser by pulsing a GPIO pin). That's about 3 times slower, just to pull a bunch of data over i2c that the code then throws away.

image from ios 4

Why do it this way? I've implemented dozens of i2c sensors and never seen anything like this. As @adibacco suggests that's a huge no-go for low-power implementations, but then continuous mode is not suitable for low power at all.

If I slow the sensor to 0.5 Hz it ties up the i2c bus for a solid 2 seconds, re-requesting the same frame of data over and over again:

image from ios 5

Note- the above is using your getFrameData method as described, and with no manual paging or other non-standard settings.

slavysis commented 6 years ago

Hi,

You cannot set the subpage to be measured. The MLX90640 sensor alternates it automatically, and the register bits simply indicate which subpage the read-out data applies to. The bit is 'read only'.
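
Since the subpage indication can only be read back, host code ends up decoding it after each measurement. Here is a minimal sketch of that decoding, using the status-register bit positions mentioned earlier in this thread (bit 0 for the subpage, bit 3 for 'new data available', bit 4 for 'enable RAM overwrite'). The struct and function names are illustrative only, and the bit positions should be checked against the datasheet.

```cpp
#include <cstdint>

// Bit masks as described in this thread; treat them as assumptions.
constexpr std::uint16_t kSubpageMask   = 0x0001;  // bit 0: subpage of the data in RAM
constexpr std::uint16_t kNewDataMask   = 0x0008;  // bit 3: new data available in RAM
constexpr std::uint16_t kOverwriteMask = 0x0010;  // bit 4: RAM overwrite enabled

// Decoded view of the status register (0x8000).
struct StatusFields {
    int  subpage;
    bool newData;
    bool overwriteEnabled;
};

StatusFields decodeStatus(std::uint16_t statusRegister)
{
    StatusFields fields;
    fields.subpage = statusRegister & kSubpageMask;
    fields.newData = (statusRegister & kNewDataMask) != 0;
    fields.overwriteEnabled = (statusRegister & kOverwriteMask) != 0;
    return fields;
}
```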

The step mode is meant for a sort of synchronization rather than as a low-power mode. Note that even in step mode the MLX90640 power consumption is considerable and might not be suitable for battery applications (especially with small-capacity batteries). The step mode could be used, but only in very limited cases. This is because the thermal effects inside the package are different in continuous and in step mode. The device is calibrated in continuous mode, so the best performance is in continuous mode, and step mode might give strange results under certain conditions.

The getFrameData method basically does two things:

  1. Waits for a new set of data = polls the 'new data available' bit. Note it does not read the frame data, but only polls the bit.
  2. Reads out the whole frame data making sure that no new data has been written meanwhile
    • If new data has been written, the data just read is thrown away, as it might be corrupted, and the frame is read out again. After a certain number of retries an error is returned.
    • If NO new data has been written while getting the frame, the read-out data is valid.
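
As a minimal sketch of the flow described above (not the library's actual implementation): the bus access is hidden behind three hypothetical callbacks, the retry limit is a parameter, and the 'new data available' flag is assumed to be bit 3 of the status register, as discussed earlier in this thread. Unlike the real function, the sketch returns immediately instead of waiting when no data is ready.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical bus abstraction; these callbacks stand in for real I2C calls.
struct Bus {
    std::function<std::uint16_t()> readStatus;                   // read the status register
    std::function<void()> clearNewData;                          // reset the 'new data' flag
    std::function<void(std::vector<std::uint16_t>&)> readFrame;  // dump the frame RAM
};

constexpr std::uint16_t kNewDataBit = 0x0008;  // bit 3, per the discussion above

enum class FrameResult { Ok, NotReady, TooManyRetries };

FrameResult readStableFrame(const Bus& bus, std::vector<std::uint16_t>& frame, int maxTries)
{
    // Step 1: only poll the flag; no frame data is read here.
    if ((bus.readStatus() & kNewDataBit) == 0) {
        return FrameResult::NotReady;
    }
    // Step 2: read the frame, then check that no new data arrived meanwhile.
    for (int attempt = 0; attempt < maxTries; ++attempt) {
        bus.clearNewData();
        bus.readFrame(frame);
        if ((bus.readStatus() & kNewDataBit) == 0) {
            return FrameResult::Ok;  // flag still clear: the copy is consistent
        }
        // Flag set again: RAM was overwritten during the read, so retry.
    }
    return FrameResult::TooManyRetries;
}
```

Every retry costs a full frame transfer, which is why a readout time close to the refresh period shows up as heavy bus traffic.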

Now, if the refresh rate you set is 0.5 Hz = 2sec period and the data processing time is 0.47 sec + 30ms i2c actual reading (depending on the i2c clk), you would get execution time for the getFrameData method of about 1.5 seconds. That is simply because there is no new frame data available. If you only change the refresh rate to 2Hz = 0.5 sec period, the execution time would be around 30ms (depending on the i2c clock) as there would be new frame available when you call the method. So I would recommend that you have some kind of a wait after the data processing such that you call the getFrameData method a bit before or even better - after there is a new frame available. In the 'recommended measurement flow' published in the datasheet the waiting time that is recommended is 80% of the refresh period, but that is a general value and could be further optimized in order to have stable data acquisition and 'silent' i2c lines. If we use the above values, it is better to have a 1.4seconds wait so that there is no constant communication on the i2c line.
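
The wait-time reasoning above is simple arithmetic and can be sketched like this. The function name and the 100 ms safety margin are assumptions chosen to reproduce the 1.4 second figure; neither comes from the datasheet.

```cpp
#include <algorithm>

// Idle time (ms) to wait after processing before polling for the next frame.
// marginMs is a hypothetical safety margin so that polling starts slightly
// before the new frame is expected. The result is never negative.
double idleWaitMs(double refreshPeriodMs, double processingMs, double readoutMs, double marginMs)
{
    const double wait = refreshPeriodMs - processingMs - readoutMs - marginMs;
    return std::max(0.0, wait);
}
```

With a 2000 ms period, 470 ms of processing, 30 ms of readout and a 100 ms margin, the result is 1400 ms. At a 500 ms period the same workload leaves no idle time, so the function returns 0.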

Gadgetoid commented 6 years ago

Okay, I understand your response regarding step mode and it certainly explains why I haven't been able to get it working in any meaningful way. That's a shame, it would have been convenient as an easy way for users to grab the data into Python for processing.

It's looking like- realistically- I may be able to achieve 16 or 32 FPS with upscaling and processing to produce a GUI that will allow a beginner to make use of the camera. 64 FPS might even be possible, but I will have to experiment.

Thanks for your feedback so far- right now I have some examples that show the camera running at all possible framerates in continuous mode on the Pi, with automatic calculation of the wait period (based on how long the i2c transaction and drawing code take to run).
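
A rough sketch of how such an automatic wait calculation might look, assuming the work per frame (i2c transaction plus drawing) is wrapped in a single callable. The names are hypothetical and this is not the code from the examples mentioned above.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Time left in the current frame period after `elapsed` has been spent
// on the i2c transaction and drawing. Never negative.
Ms remainingWait(Ms framePeriod, Ms elapsed)
{
    return elapsed >= framePeriod ? Ms{0} : framePeriod - elapsed;
}

// Runs `work` once, measures how long it took, and returns how long the
// caller should sleep before starting the next frame.
template <typename Work>
Ms paceFrame(Ms framePeriod, Work&& work)
{
    const auto start = Clock::now();
    work();  // e.g. read the frame and draw it
    const auto elapsed = std::chrono::duration_cast<Ms>(Clock::now() - start);
    return remainingWait(framePeriod, elapsed);
}
```

The caller would pass the returned duration to `std::this_thread::sleep_for`, so the wait shrinks automatically when the drawing code gets slower.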

ackoc23 commented 5 years ago

Hi, have you gotten it to work? I want to read from the MLX90640 as fast as possible using I2C.

JasonOsborn commented 4 years ago

@slavysis Apologies for bothering you so long after this issue was opened- But to clarify, when you mentioned "You cannot set the subpage to be measured" for Gadgetoid's code, is this referring to the SetDeviceMode() function or something else?

slavysis commented 4 years ago

@JasonOsborn Sorry for answering so late, but I just noticed your question. I mean that when both subpages are being measured, the sensor automatically alternates the subpages to measure and only reports which subpage is being measured. So indeed, the SetDeviceMode() function is the one that I am referring to.