MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

[BUG] Shifty I2C OLED on LPC1768 #14431

Closed istyszy closed 3 years ago

istyszy commented 5 years ago

Hello,

I have found this topic on the bugs list and it is closed, but the problem still exists.


I have an I2C OLED display with Marlin 2.0 (2019-06-24) on an SKR V1.3, using hardware I2C from E1. When the board starts everything is OK and the display works as it should, but after 40 seconds or so the display crashes. Has anyone else had this problem? I used the display with RAMPS + Mega and never had any issues. I use U8GLIB_SH1106 (also tried U8GLIB_SSD1306, same result). My DIY display: https://reprap.org/forum/read.php?13,499572,500608#msg-500608

I have tried all the solutions in https://github.com/MarlinFirmware/Marlin/issues/13257, even with a 3.3V regulator. The problem remains: display artifacts after some time.

yet-another-average-joe commented 3 years ago

I gave up with the OLED and hooked up an EEPROM module, daisy-chained with 150mm Dupont cables from the OLED to the EEPROM. Total I²C length = 450mm between the EEPROM and the board, with the OLED in between. It has been 3h30 since it was done, and the display has not shifted since! (The ferrite is still there; I will remove it at some point.) I don't give up anymore :)

Crashes/reboots still occur from time to time (the last 1h30 without).

What's happening is unclear. I've been playing with these OLEDs hooked to STM32s and Arduinos with 600mm (2x 300 Dupont) lying among all sorts of cables, including (twisted !) RS485 (total 6 populated breadboards) and I²C LCD, without an issue.

This EEPROM has 10k pullups. Therefore, the bus is now terminated with 4k7 on the SKR, and 10k on the EEPROM.

Adding 4k7 pullups on the OLEDs had no positive effect (it was worse). Adding an EEPROM seems to make things better (by far).

Soon, I'll test without the EEPROM but with 10k pullups instead of 4k7. Just to see. The scope was showing no difference with or without (rise time). But it seemed that with the probes attached, display shifting took a bit longer to appear (10 min vs 5 min). Also, the scope shows that at the default speed, the rise time is not negligible (order of magnitude = half the clock pulse width). Maybe the default speed is a bit optimistic with this board? Could someone tell me where the I²C speed can be adjusted?
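As a rough cross-check of the pull-up and rise-time discussion above, the rising edge of an open-drain I²C line can be modelled as an RC charge through the pull-ups. The sketch below uses the 30%-to-70% rise-time definition, t_r = ln(0.7/0.3) · R · C, and the I²C specification limits of 1000 ns (standard mode) and 300 ns (fast mode). The 100 pF bus capacitance is a placeholder assumption, not a value measured on this setup, and the helper names are made up for illustration.

```python
import math

def parallel(*resistors_ohm):
    """Equivalent resistance of pull-ups connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)

def rise_time_ns(r_pullup_ohm, c_bus_farad):
    """30%-to-70% rise time of an RC-charged open-drain line, in ns."""
    return math.log(0.7 / 0.3) * r_pullup_ohm * c_bus_farad * 1e9

# Maximum rise times allowed by the I2C specification.
MAX_RISE_NS = {"standard mode (100 kHz)": 1000.0, "fast mode (400 kHz)": 300.0}

# ASSUMPTION: placeholder bus capacitance (wiring + pins); measure your own.
C_BUS_FARAD = 100e-12

# Pull-up combinations mentioned in the thread.
PULLUP_CASES = {
    "4k7 on the SKR only": parallel(4700),
    "4k7 on the SKR + 10k on the EEPROM": parallel(4700, 10000),
}

def report(c_bus_farad=C_BUS_FARAD):
    """Return one summary line per pull-up combination."""
    lines = []
    for label, r in PULLUP_CASES.items():
        t_r = rise_time_ns(r, c_bus_farad)
        verdicts = ", ".join(
            f"{mode}: {'ok' if t_r <= limit else 'too slow'}"
            for mode, limit in MAX_RISE_NS.items()
        )
        lines.append(f"{label}: R = {r:.0f} ohm, t_r = {t_r:.0f} ns ({verdicts})")
    return lines

if __name__ == "__main__":
    print("\n".join(report()))
```

The result scales linearly with the assumed capacitance, so long Dupont leads or scope probes move it noticeably in either direction.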

yet-another-average-joe commented 3 years ago

No idea if this could be related, but...

The scope shows a very short "negative" spike at the beginning of the initialization sequence (yellow = SDA, blue = SCL): a few tens of nanoseconds before the signal rises again and then falls. And shouldn't SCL be at 3.3V when idle? Is this normal?

[Image: oscilloscope capture pic_125_7]

yet-another-average-joe commented 3 years ago

Solved !

No more crashes.

No more shifting.

Just remove U8G_I2C_OPT_FAST. With this option the I²C frequency is 400kHz, which seems to be too much, or a bad choice, for these displays. Without the option it is 100kHz. Verified with the scope.

The display seems a bit less responsive, but not even sure...

No idea why 400kHz causes problems. The SSD1306 datasheet says the oscillator frequency is typically 370kHz (min = 333kHz, max = 407kHz). Could there be some interference between a 400kHz I2C SDA signal and the oscillator?

The devs should add an option in Configuration.h for I2C frequency control.

[EDIT] Just had a crash, but there is no evidence it is related to the display. (No motors, no endstops, a crappy SD card, an old and cheap PSU, a crappy 3DTouch I use for tests only, and a RasPi with some bad tweaks.)

atoomnetmarc commented 3 years ago

You are onto something. But....

I modified marlinui_DOGM.h and removed U8G_I2C_OPT_FAST from the SSD1309 section. Now the display does not shift anymore. The display is now noticeably (and understandably) slow. I used this recent version https://github.com/MarlinFirmware/Marlin/tree/d17db477753067982b2cbf277609f4d66b0b57a9

However, I am now unable to print anything. The printer resets just after homing and trying to prime the nozzle. Also, when the printer is idle, I noticed a sudden reset.

After reverting the change I can print again (with a shifty display) and no sudden resets occur when idle.

iotricity commented 3 years ago

The I2C displays I've tested with can do even 800kHz SCL/SDA on an ESP32 using bit-banging code. The same displays fail on the LPC1769. If data was corrupted, it would garble the display's content, and the lines would jump up and down if the next display/frame update is okay. This is not the case. The shifting is a constant process, not a glitch.

As far as I've discovered and monitored, there is a shift in the frame buffer offset. All I2C data is correctly clocked at 400kHz, with no data errors. Maybe the display parameters change over time. I haven't been able to discover where that happens.

yet-another-average-joe commented 3 years ago

Yes, shifting looks like the 1st frame replaces the 8th one, the 8th goes to first, and so on (AFAIR, there are 8 frames, each 8px high). Looks like a rotation or a ring buffer with shifted indices.

On my side, with the 62680bb commit, I had another crash, but now it's been stable for 3 hours (no shifting, no reboot). Obviously, this thing is still unable to print. But it was stable for one full day. It was Monday and it was a full moon. The next day, after some reorganization on the breadboard, shifting was back.

On u8glib Git, there's not much information about speed. Somewhere, olikraus says Wire is poorly documented about speed. Unfortunately, he does not support u8glib anymore...

Wiring : Until now, the EEPROM and the OLED got +5V. Now it's 3.3V, and everything is pulled to 3.3V instead of 5V. Reverted to U8G_I2C_OPT_FAST. But it can take up to one hour or more... Wait and see. Preparing the last commit for flashing.

yet-another-average-joe commented 3 years ago

Definitely solved the problem using SPI. Many OLEDs are SPI/I²C ; better use them as SPI : MKS_12864OLED_SSD1306 in Configuration.h, some editing in pins_BTT_SKR_V1_4.h, marlinui_DOGM.h and marlinui_DOGM.cpp ; done !

KingTomaHawk commented 3 years ago

> Definitely solved the problem using SPI. Many OLEDs are SPI/I²C ; better use them as SPI : MKS_12864OLED_SSD1306 in Configuration.h, some editing in pins_BTT_SKR_V1_4.h, marlinui_DOGM.h and marlinui_DOGM.cpp ; done !

Thanks, that sounds great. Can you tell me what exactly you changed in these files? I want to test it on my printer.

yet-another-average-joe commented 3 years ago

Not sure this is the right place for explanations about wiring an SPI OLED; this thread is a bug report about I²C, and it isn't really a solution! I will shoot a short video and add the link to this post and the RepRap forum when it is done (I have to verify some details and shoot one or two pics). It should be more explanatory, because there are compilation errors that have to be fixed (MKS_12864OLED_SSD1306 is not supported by Marlin with Re-ARM). Be aware that printing from LCD-SD could lead to problems: some users have been reporting stuttering with LCD Touch displays. It will take a couple of days, but I will do it!

obertr0n commented 3 years ago

> The I2C displays I've tested with can do even 800kHz SCL/SDA on an ESP32 using bit-banging code. The same displays fail on the LPC1769. If data was corrupted, it would garble the display's content, and the lines would jump up and down if the next display/frame update is okay. This is not the case. The shifting is a constant process, not a glitch.
>
> As far as I've discovered and monitored, there is a shift in the frame buffer offset. All I2C data is correctly clocked at 400kHz, with no data errors. Maybe the display parameters change over time. I haven't been able to discover where that happens.

@iotricity if this is the case, how come patching the code in Marlin/src/HAL/LPC1768/u8g/u8g_com_HAL_LPC1768_ssd_hw_i2c.cpp appears to solve the issue?

I'm not an expert on I2C or the SSD1306, but I cannot understand why that code fixes the issue. Does it avoid the corruption somehow? I have been using a 0.96" OLED with an encoder as a controller for the printer for over a year with an SKR 1.1, and there was no shifting with that workaround.

@yet-another-average-joe can you give it a try with the above mentioned file patched in your build with your hw setup?

yet-another-average-joe commented 3 years ago

I modded Marlin according to your instructions. I didn't observe any shifting. But unfortunately, it reboots after half an hour or so (3x). Keep in mind my setup is non-functional. It is just an SKR 1.4 Turbo, a RasPi, a 7" HDMI touch LCD, and a breadboard. It can't print.

iotricity commented 3 years ago

@yet-another-average-joe The rebooting is probably caused by the watchdog that times out because updating the display takes too long.

@obertr0n Sending a new I2C start condition before every byte causes a massive delay. Sure, it solves timing problems and spikes, because each data transfer includes only 1 byte per start condition. There might be a wiring problem in your setup that causes a lot of noise, spikes or timing issues on the I2C bus.
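To put rough numbers on the delay described above, the sketch below counts SCL clocks for one full 128x64 frame (1024 bytes) sent either as one write per page or as one write per data byte, then converts that to milliseconds at 100 kHz and 400 kHz. It is a back-of-the-envelope model under stated assumptions: 9 clocks per byte (8 bits + ACK), an address byte and a control byte in front of every write, about one clock each for START and STOP, and one write per page in the burst case. How u8glib or the patched HAL actually batches bytes is not taken from the source, and command bytes and CPU overhead are ignored.

```python
FRAME_BYTES = 128 * 64 // 8   # monochrome 128x64 frame = 1024 bytes
PAGES = 8                     # 8 pages of 128 bytes
CLOCKS_PER_BYTE = 9           # 8 data bits + ACK
HEADER_BYTES = 2              # I2C address byte + display control byte
START_STOP_CLOCKS = 2         # ASSUMPTION: ~1 clock each for START and STOP

def clocks_burst():
    """SCL clocks when each page is sent as a single I2C write."""
    page_bytes = FRAME_BYTES // PAGES
    per_page = START_STOP_CLOCKS + (HEADER_BYTES + page_bytes) * CLOCKS_PER_BYTE
    return PAGES * per_page

def clocks_single_byte_writes():
    """SCL clocks when every data byte gets its own START/address/STOP."""
    per_byte = START_STOP_CLOCKS + (HEADER_BYTES + 1) * CLOCKS_PER_BYTE
    return FRAME_BYTES * per_byte

def frame_ms(clocks, scl_hz):
    """Bus time for one frame in milliseconds at the given SCL frequency."""
    return clocks / scl_hz * 1000.0

if __name__ == "__main__":
    for scl_hz in (100_000, 400_000):
        burst = frame_ms(clocks_burst(), scl_hz)
        single = frame_ms(clocks_single_byte_writes(), scl_hz)
        print(f"{scl_hz // 1000} kHz: burst {burst:.1f} ms, "
              f"one byte per write {single:.1f} ms")
```

Under these assumptions the per-byte scheme costs roughly three times the bus time of the burst scheme at the same clock, and dropping from 400 kHz to 100 kHz multiplies either figure by four.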

Since the display first shifts two rows after a few minutes and later on another two rows, and then stays stable without any other erratic behaviour, it must have something to do with the display offset. According to the documentation (https://www.velleman.eu/downloads/29/infosheets/sh1106_datasheet.pdf) there are two possibilities for shifting rows. The first is on page 20, item 4, which sets the display's starting line in the RAM. If you change this value, the entire display content is written starting at an offset of N rows. The other possibility is on page 24, item 14, which sets the display offset in rows. This option allows shifting the rows when display content is already stored in the display's RAM.

Both options can exactly reproduce the behaviour of the bug. However, pressing the reset button on the SKR or disconnecting the I2C bus causes an I2C communication disruption. The processor on the display detects a wrong I2C bus condition and probably resets the offsets, causing the display to jump back to the right offset (starting at row 0), so the display content is no longer shifted. Unfortunately there's no documentation on this process.
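The effect of the start-line register described above can be illustrated with a toy model. In the sketch below the display RAM is a list of 64 rows and the panel shows RAM row `(y + start_line) % 64` on physical row `y`, which is how the "Set Display Start Line" command (0x40 to 0x7F) is documented for these controllers. The function and variable names are made up for illustration. The "Set Display Offset" command (0xD3 plus one byte) also produces a vertical rotation, but its direction relative to this model should be checked against the datasheet.

```python
HEIGHT = 64  # rows on a 128x64 panel

def panel_view(ram_rows, start_line=0):
    """Rows as they appear on the panel for a given start-line register.

    The RAM contents are not modified; only the row mapping changes.
    """
    height = len(ram_rows)
    return [ram_rows[(y + start_line) % height] for y in range(height)]

# Label every RAM row so the rotation is easy to see.
ram = [f"row{n:02d}" for n in range(HEIGHT)]

normal = panel_view(ram)                  # start line 0: unshifted image
shifted = panel_view(ram, start_line=2)   # a two-row shift, as reported

if __name__ == "__main__":
    print("top of normal panel: ", normal[:3])
    print("top of shifted panel:", shifted[:3])
    print("bottom of shifted panel:", shifted[-3:])
```

In this model the image wraps around instead of being corrupted, and the bytes stored in RAM stay exactly the same, so a register change of this kind would not show up as a difference in the frame data itself.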

yet-another-average-joe commented 3 years ago

@KingTomaHawk

I answered you here: https://reprap.org/forum/read.php?415,879480 (and somewhat cleaned up and simplified things)

github-actions[bot] commented 3 years ago

This issue has had no activity in the last 30 days. Please add a reply if you want to keep this issue active, otherwise it will be automatically closed within 7 days.

ellensp commented 3 years ago

no bot no!

jatmn commented 3 years ago

Just wanted to note that I see this issue as well with an SKR 1.4 Turbo (LPC1769) and the Ultimaker 2 SD/Screen/Knob combo (ULTI_CONTROLLER). It looks like the screen shifts after just under 7 min for me each time (6:45ish).

Dying to update my UM2 to Trinamic drivers without losing the use of my screen :( I can see a lot of UM2 owners wanting to make the transition if we can sort out the issue.

I can offer to help test changes; unfortunately, I don't have a scope for additional data collection.

yet-another-average-joe commented 3 years ago

@jatmn with a scope, it's impossible. Shifting can appear after a variable time: from a few seconds to... infinite! That means using a logic analyzer with a ring buffer and some sort of triggering. Maybe PulseView has such plugins. And even knowing what goes on the bus does not tell why. That being said, the SSD1309 protocol is pretty simple and easy to analyze (I just wrote an emulator for the SPI version: https://www.youtube.com/watch?v=vd2Wgo4UeBw). For now I can't help: the SKR 1.4 and the logic analyzer are in use for another display emulator (reprap full blah...).
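The "ring buffer plus trigger" idea mentioned above can be sketched in a few lines: keep only the most recent samples, and freeze the buffer shortly after a trigger condition is seen, so the events leading up to the glitch are preserved however long the wait was. This is a generic illustration, not a PulseView plugin; the class name, the trigger predicate and the choice of 0xD3 as a trigger byte are all hypothetical.

```python
from collections import deque

class RingCapture:
    """Keep the last `depth` samples; freeze `post_trigger` samples after a hit."""

    def __init__(self, depth, trigger, post_trigger=0):
        self.buffer = deque(maxlen=depth)  # old samples fall off the front
        self.trigger = trigger             # callable: sample -> bool
        self.post_trigger = post_trigger   # samples to keep after the hit
        self.remaining = None              # None until the trigger fires

    @property
    def frozen(self):
        return self.remaining == 0

    def feed(self, sample):
        """Add one sample; return True once the capture is frozen."""
        if self.frozen:
            return True
        self.buffer.append(sample)
        if self.remaining is None:
            if self.trigger(sample):
                self.remaining = self.post_trigger
        else:
            self.remaining -= 1
        return self.frozen

if __name__ == "__main__":
    # Hypothetical trigger: stop one sample after a 0xD3 byte is seen.
    cap = RingCapture(depth=4, trigger=lambda b: b == 0xD3, post_trigger=1)
    for byte in list(range(10)) + [0xD3, 0x07, 0xAA]:
        if cap.feed(byte):
            break
    print([hex(b) for b in cap.buffer])
```

The hard part in practice is choosing the trigger, since the shift is only visible on the panel; a candidate is any display command that is not expected outside the initialization sequence.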

jatmn commented 3 years ago

@yet-another-average-joe it was a general remark that I don't have the means of capturing additional information beyond observing. Good to know this can't be seen on a scope, however.

Also, my finding, at least so far, is that it is very much not random: it happens within a significantly narrower time frame.

Very cool emulator though :o

yet-another-average-joe commented 3 years ago

I did some more investigation...

I reflashed a firmware for the SSD1306 in I²C mode. Communication is a bit different from SPI mode, but not that much: 4x 256-byte double pages (instead of 8x 128). An address is at the beginning of each page. For I²C, we read:

... B0...[2x PAGE]...B2[2x PAGE]... ... B6...[2x PAGE]...

After the display shifts, we read exactly the same. I compared the data patterns before and after shifting : they are the same, and there's no address shifting (was one of my hypotheses). The problem is elsewhere (no page addressing problem, no data shifting problem)

Page numbering: [image: page_number]

Before: [image: before]

After (with shifted display): [image: after]

So, somewhere in between, there's something that makes the SSD1306 consider that :

page 0 is page 7
page 1 is page 0
page 2 is page 1
page 3 is page 2
...

This is weird, as pages are sent two by two: 4 pairs of pages, 4x 256B (in SPI mode it is one by one, 8 pages of 128B)

Now looking at using an Arduino or STM32 for sniffing the bus in the hope of detecting a shift command somewhere...
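A sniffer along those lines needs to separate command bytes from pixel data and then skip command arguments, otherwise an argument such as a contrast value of 0x47 would be mistaken for a "start line = 7" command. The sketch below is one way to do that on already-decoded I²C write payloads (address byte removed). The control-byte handling (Co bit 0x80, D/C# bit 0x40) and the argument counts are written from memory of the SSD1306 datasheet and should be treated as assumptions to verify; the SH1106 command set differs in places. Function names are made up for illustration.

```python
# Argument bytes following each multi-byte SSD1306 command.
# ASSUMPTION: recalled from the SSD1306 datasheet; verify before relying on it.
ARG_COUNT = {
    0x20: 1, 0x21: 2, 0x22: 2, 0x26: 6, 0x27: 6, 0x29: 5, 0x2A: 5,
    0x81: 1, 0x8D: 1, 0xA3: 2, 0xA8: 1, 0xD3: 1, 0xD5: 1, 0xD9: 1,
    0xDA: 1, 0xDB: 1,
}

def command_bytes(payload):
    """Extract command bytes from one I2C write payload (address removed)."""
    commands = []
    i = 0
    while i < len(payload):
        control = payload[i]
        is_data = bool(control & 0x40)   # D/C# bit: 1 = display data
        if control & 0x80:               # Co bit: exactly one byte follows
            if not is_data and i + 1 < len(payload):
                commands.append(payload[i + 1])
            i += 2
        else:                            # rest of the payload is one stream
            if not is_data:
                commands.extend(payload[i + 1:])
            break
    return commands

def position_commands(commands):
    """Return (opcode, description) for commands that move the image."""
    hits = []
    i = 0
    while i < len(commands):
        op = commands[i]
        nargs = ARG_COUNT.get(op, 0)
        args = commands[i + 1:i + 1 + nargs]
        if 0x40 <= op <= 0x7F:
            hits.append((op, "display start line = %d" % (op & 0x3F)))
        elif op == 0xD3 and args:
            hits.append((op, "display offset = %d" % (args[0] & 0x3F)))
        elif op in (0x26, 0x27, 0x29, 0x2A, 0xA3):
            hits.append((op, "scroll setup"))
        elif op == 0x2F:
            hits.append((op, "activate scroll"))
        i += 1 + nargs                   # skip the arguments
    return hits

if __name__ == "__main__":
    # Made-up init-like payload: 0x81 0x47 is contrast, not a start line.
    payload = bytes([0x00, 0xAE, 0xD3, 0x00, 0x40, 0x81, 0x47, 0x2E, 0xAF])
    for op, text in position_commands(command_bytes(payload)):
        print("0x%02X: %s" % (op, text))
```

Hits with value 0 during initialization are expected; the interesting cases would be a nonzero start line or offset, or any scroll command, appearing later in the capture.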

The problem is most likely deep in u8glib, which has been abandonware for a long time. I doubt we could get any help from olikraus...

Just now : I looked again at the display : it was normal again, meaning the motherboard crashed and rebooted... I had that problem all the time, reason why I gave up with I²C for my emulator.

@jatmn : the analyzer I use was 3.5 USD including P&P from Ali, and the software is PulseView (free, open software) ; no need for sophisticated and expensive things !

[EDIT] the shifting is not even exactly one page. Each page is 8 pixels high. It appears shifted by 7 lines.

yet-another-average-joe commented 3 years ago

I spent the afternoon on the problem. With no success (one more time). There are some weird facts.

I tried this: removing (commenting out) all 0x2E initialization commands ("deactivate scroll"), as it isn't in the datasheet example. As a result, I got crashes/reboots before the display shifted (after a much longer time).

The more I think of this, the more I think the problem is not at u8glib level, but at I²C level...

@jatmn: what pins are you using? On my side, on the LPC1769, it's 0.0 and 0.1: the I²C connector.

Also, when hitting the reset button, the shifted display un-shifts just before the boot sequence starts. Why ??? This makes no sense. Looking at the analyzer, there's nothing special. Could this be a Re-ARM silicon bug ? It is heavily time consuming, as we have to wait for a while before the glitch happens.

jatmn commented 3 years ago

I am using 0.0 and 0.1 on the I2C connector as well.

Something I noted and not sure if you saw or is possibly just not relevant. If you manually change the screen by going into the menus it corrects its position as well.. and exiting to the main screen again results in a fixed screen position, at least until it shifts again.

You also noted a hard crash. Just thinking out loud here, as I'm not a programmer: I recently helped point to a memory loop bug on the LPC176x when using IDEX. The motion system appeared to have a bug in it, resulting in a looping process that overloaded the memory and crashed. I wonder if there are more of these types of issues in other processes on the LPC176x?

yet-another-average-joe commented 3 years ago

I had a look at the schematics (one more time...); 0.0 and 0.1 come directly from the MC. They are normally used for the EEPROM module, and nothing else is hooked to the lines. With an EEPROM, I couldn't see any activity on the I2C bus while issuing M503's. Looking at the code, the EEPROM seemed to be emulated. I found issue #17799 and forced I2C_EEPROM and E2_END 0x7FFF; still nothing. Is I²C cursed? It's (very) late, I will search the issues another time.

When I go into the menus, nothing changes. Still shifted.

I had a look at the Ultimaker OLED panel: it's very understandable that you want to use it!

jatmn commented 3 years ago

The main reason I want to use the UM panel is that I have a clone and a genuine UM2 that I want to move over to 32-bit with silent drivers (they are kinda loud machines). Retrofitting a third-party display to the body is kinda hacky.

Anyhow just wanted to note I picked up a SKR Pro 1.2 (STM processor) and so far it appears to support the screen just fine.. Something for sure up with the LPC processor or how the SKR boards are wired..

You noted the header is for the EEPROM, which they sell a module for, but that feels odd as Marlin can emulate EEPROM on the LPC. Weird.

nikki-reprap commented 3 years ago

I’ve got the same problem as @istyszy. Has a fix been found?

github-actions[bot] commented 3 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.