Microchip-MPLAB-Harmony / core

Harmony 3 Core
https://onlinedocs.microchip.com/v2/keyword-lookup?keyword=MH3_core&redirect=true
Other
17 stars 12 forks source link

DRV_I2C_WriteReadTransferAdd stops working after a few seconds #25

Open mumch opened 2 years ago

mumch commented 2 years ago

We use the I2C Driver (I2C1, single Master, single instance of driver using, sending 2 bytes everytime) of Harmony (Core Version 3.10.0) with a clock of 50 kHz on a PIC32MM0256GPM064. The XC32 Compiler Version is v4.00. We send cyclic DRV_I2C_WriteReadTransferAdd and after some time (few tens of seconds), the communication stops working. If we add a delay of 50 ms between the DRV_I2C_WriteReadTransferAdd calls, the I2C continues to work for a random time amount, but then still stops working. Examining the I2C with a scope shows SCL and SDA are pulled low by the PIC. And while it works, the Signals are clear and don't show any issue.

The hint found in https://github.com/Microchip-MPLAB-Harmony/core/issues/2 was helpful: If setting the gcc optimization-level to 1, the communication no longer stops working after some seconds. (Tested several times back and forth to be sure.) I don't understand why the other issue was closed, without investigating the reason and fixing the bug, as it still seems to remain almost 2 years later..!?

vishalnxt commented 2 years ago

@mumch , can you attach a basic project to reproduce the issue?

mumch commented 2 years ago

Dear @vishalnxt sorry, I can't add our entire project and I don't have time to create one, as this bug threw me much behind the timeline. But additional to the information above, I can give you further details:

The Harmony Project uses Core and NO RTOS, Core configuration in MHC: mhc_core_config

I2C Configuration in MHC: mhc_config

Compiler settings, which result in the issue occures: compiler_settings_where_the_issue_occures

Init:

app.i2cHandle = DRV_I2C_Open( DRV_I2C_INDEX_0, DRV_IO_INTENT_EXCLUSIVE /*DRV_IO_INTENT_READWRITE shows same behavior*/ );
        if(app.i2cHandle != DRV_HANDLE_INVALID) {
            app.i2cSetup.clockSpeed = 50000;
            if(DRV_I2C_TransferSetup(app.i2cHandle, &app.i2cSetup)) {
                app.state = APP_STATE_DELAY_UNTIL_REQUEST_IS_POSSIBLE;
            }
        }  

WriteReadTransfer (as a correction to the description above: one byte is sent and two are read):

 DRV_I2C_WriteReadTransferAdd(
            app.i2cHandle, 
            APP_I2C_SLAVE_ADDRESS, 
            app.i2cTxBuffer, 
            1,    
            app.i2cRxBuffer, 
            2, 
            &app.transferHandle);

I hope this allows you to reproduce it. Let me know, if anything is unclear. Thank you very much.

vishalnxt commented 2 years ago

@mumch , how do you know if the transfer has completed? In other words, can you share as to what you check to know if the transfer is still in progress or has completed?

mumch commented 2 years ago

Let me mention an additional point: You don't even need an I2C slave (of course the SDA and SCL pullups are required), you can simply address anything. If, for instance we address our slave which is not yet available and hence does not ACK, the issue occures too. It's addressed: i2c_working_but_slave_offline

And (if optimization-level is set to 2 and no delay is made between the requests), after 10 to 30 seconds: i2c_stops_working_pic_draws_scl_and_sda_low

vishalnxt commented 2 years ago

@mumch , thank you for the waveforms. But I will ask again, how do you ensure that the transfer has completed? Do you register a callback to get notified or do you call the DRV_I2C_TransferStatusGet() API. Please note that the call to the DRV_I2C_WriteReadTransferAdd API returns immediately. It only starts the transfer (or queues in the transfer if a transfer is already in progress) and returns. Rest of the transfer than takes place from the interrupt context. Application then needs to either wait for the callback (if registered) or poll the status of the transfer by calling the DRV_I2C_WriteReadTransferAdd() API.

mumch commented 2 years ago

To your question regarding handle the transfer: As soon as the I2C DRV_I2C_WriteReadTransferAdd() was made, we go to a State (APP_STATE_WAIT_ON_READ_DONE) to make sure, no further request is made. And in this State. we do:

appData.transferStatusEvent = DRV_I2C_TransferStatusGet(appData.transferHandle);

         //If timeout is over:
        if(appData.remainingTimeout_ms == 0) {
            appData.readResult.stillBusyAndNotYetReady=false;
            appData.readResult.successful=false;

            DRV_I2C_Close(appData.i2cHandle);
            appData.state = APP_STATE_INIT_I2C;

            if(app_datahndlrData.liveData.developerDebugOutputIsEnabled) {
                printf("** i2cResponseTimeout ** \r\n");
            }
        }

         /* Transfer request is pending */
        else if(appData.transferStatusEvent == DRV_I2C_TRANSFER_EVENT_PENDING) {
            appData.readResult.stillBusyAndNotYetReady=true;
            appData.state = APP_STATE_WAIT_ON_READ_DONE; //continue to wait

            if(appData.transferHandle == DRV_I2C_TRANSFER_HANDLE_INVALID)   {
                //If transfer handle is invalid, stop waiting
                appData.readResult.stillBusyAndNotYetReady=false;
                appData.readResult.successful=false;
                DRV_I2C_Close(appData.i2cHandle);
                appData.state = APP_STATE_INIT_I2C;
            }

        }

        /* All data from or to the buffer was transferred successfully. */
        else if(appData.transferStatusEvent == DRV_I2C_TRANSFER_EVENT_COMPLETE) {
            appData.readResult.stillBusyAndNotYetReady=false;
            appData.readResult.successful=true;

            appData.readResult.result = (appData.i2cRxBuffer[0]);
            appData.readResult.result += ((appData.i2cRxBuffer[1]) << 8);

            appData.remainingTimeout_ms = APP_I2C_MIN_DELAY_BETWEEN_REQUESTS_MS; //Add a timeout
            appData.state = APP_STATE_DELAY_UNTIL_REQUEST_IS_POSSIBLE;
            //DRV_I2C_Close(appData.i2cHandle);
            //appData.state = APP_STATE_INIT_I2C;
        }

        /* Handle error cases:
         * 
         * DRV_I2C_TRANSFER_EVENT_HANDLE_EXPIRED:
         *  Transfer Handle given is expired. It means transfer
         *  is completed but with or without error is not known.
         * DRV_I2C_TRANSFER_EVENT_ERROR:
         *  There was an error while processing the buffer transfer request.
         * DRV_I2C_TRANSFER_EVENT_HANDLE_INVALID
         */
        else {
            appData.readResult.stillBusyAndNotYetReady=false;
            appData.readResult.successful=false;

            DRV_I2C_Close(appData.i2cHandle);
            appData.state = APP_STATE_INIT_I2C;
        }            
        break;
    }

Notes: I don't think calling DRV_I2C_Close() and going back to init i2c where the DRV_I2C_Open() is opened again is necessary, if the i2c transfer event is not 'complete'. It's only one of many tryouts while investigating the issue. You see, we have also a timeout, which is currently 200 ms after which we stopp waiting. But this does not occur until the i2c stops working.

mumch commented 2 years ago

Sorry, I was adding the waveforms at the time you asked. And while you asked again I was writing the answer..

Yes, I am aware about the mechanism (asynchronous, etc). We poll and don't use callback as you can see above. The whole I2C communication works successfully as intended by us, so no problem there.

We just have the stop-working issue if optimization-level is set to 2.

mumch commented 2 years ago

Unfortunately, I have an update on this: The test just showed, I2C still stopps working after some time - although code-optimization set to 1 and 50 ms delay between next DRV_I2C_WriteReadTransferAdd(). But this time, SCL and SDA stay high. And all requests end in ' i2cResponseTimeout ', because the DRV_I2C_TRANSFER_EVENT stays DRV_I2C_TRANSFER_EVENT_PENDING.

In order to implement a Hotfix: Is there a possibility to reset the I2C1 and/or the DRV_I2C? This way we could reset it, when a timeout occures. (As DRV_I2C_Close() and reopen are not taking effect)

mumch commented 2 years ago

@vishalnxt In between, we have changed to directly using the I2C Peripheral Library. We observe the exact same behavior. Issue number 1 (SCL and SDA are pulled low by PIC) seems also being solved by setting the gcc optimization-level to 1. Issue number 2 (SCL and SDA remain high and PIC does not drive I2C anymore) occures too. We're currently investigating the number 2. I will update, once we know more here. But it would be nice, if you could reproduce and investigate number 1, as this looks like a behavior which must be solved at the origin point.

One question for now: In Errata of PIC32MM0256GPM064 are five issues listed (Number 9, 10, 11, 12, 13) regarding "I2C Client". We assume "Client" means Slave. But this unfortunate terminology is maybe misleading. Can you confirm, "I2C Client" is equal to "I2C slave"?

vishalnxt commented 2 years ago

@mumch Yes your understanding about I2C Client is correct. I2C Client means I2C Slave.

mumch commented 2 years ago

@vishalnxt I have an update to issue number 2 (SCL and SDA remain high): We have debugged and logged the UART output to file. At the time when the problem begins, our firmware part runs once into the timeout and after this, nothing can happen again, because the I2C1_IsBusy() always returns true. And the reason for this is the I2C1STAT bit S, which stays 1 forever. Here is a printscreen of this situation: 2022-04-01_reason_why_isbusy_returns_true_although_state_is_i2c_state_idle_to_publish

mumch commented 2 years ago

@vishalnxt Any results from you regarding the Issue number 1 (SCL and SDA are pulled low by PIC)? Or are you still investigating this? It is very important to solve this issue properly. Since configuring the gcc optimization-level to 1 instead of 2 is definitely not a true "solution", especially not because it still does not give certainty regarding reliability in long-term.

Regarding issue number 2, we're running some tests with hacks (disabling and reenabling the I2C module). We're waiting on the next occurence to see, if it works. This is something which needs to be placed in the Errata of PIC32MM0256GPM064. Can you arrange this? Thank you.

vishalnxt commented 2 years ago

@mumch , we are looking into it. We will update once we have something to share.

vishalnxt commented 2 years ago

@mumch , I ran some tests with Optimization set to 2 and continuously writing and reading from an external EEPROM, however I did not see any of the above issues. I am attaching the MPLABX project. The demo is run on PIC32MM USB Curiosity board, and a I2C EEPROM click board is used as the I2C slave. The I2C EEPROM click board is connected to J12 mikroBUS header. Can you take this and see if it helps?

i2c_eeprom.zip

mumch commented 2 years ago

Hi @vishalnxt Thank you for your endeavors. Issue 1 should occure in short time, especially if the I2C Slave does not answer. And regarding issue number 2, although we have implemented a Fix in between (disable and re-enable and reinit I2C if it runs in the timeout), it didn't occure since one week. So this issue may happen extremely rarely, but if it does, it blocks further communication forever.

I will make a sample-project for you based on the PIC32MM USB Curiosity board, which allows to reproduce it. I will come back to you soon, but not in the next days, as I am working on the next topic now

vishalnxt commented 2 years ago

@mumch , just checking if you are still blocked on this issue?

mumch commented 2 years ago

Dear @vishalnxt We launched the product with the dirty workaround and ran long time test (up to now it seems working with the aforementioned dirty workaround). Of course this is far away from a well developed product. But other tasks came in between and got priority. The development of a sample project which allows reproducing the issue is still on my todo list. I hope being able to work on this issue in the near future. Sorry for the delays.

vishalnxt commented 1 year ago

Hi @mumch , Its been a while there is any activity on this ticket. Wanted to check if you are still blocked on this one or have any new updates to share?