bigtreetech / BIGTREETECH-TouchScreenFirmware

[Q] Bad serial connection, advanced ok gives checksums, without printer freezes. Next steps? #2933

Open Norvat opened 2 months ago

Norvat commented 2 months ago

I am running a TFT 35 E3.0 using firmware BIGTREE_GD_TFT35_V3.0_E3.27.x (Pre-Compiled)

Mainboard: SKR 1.3 with TMC2209 stepper drivers in UART mode, running Marlin 2.1.2.2.

I cannot get a stable connection between the TFT and the mainboard. See the pictures below for the errors.

What I have tried so far:

[six images attached]

It seems to be losing a lot of characters.

All dependencies in Marlin are included, and all features of the TFT work as they should.

Zip of my Marlin configuration, if that is helpful: Marlin-2.1.2.2.zip

Running the printer from Pronterface works and does not give errors.

I am currently at a loss about what I should try next to mitigate the problem. Any help is greatly appreciated.

ChihebMadiouni commented 2 months ago

We have the same problem with MKS TFT35 and MKS ROBIN Nano V3 on 25 machines. I have tested all the baud rates and ports, and the issue is the same: the print always stops mid-job when printing from the TFT SD or the TFT USB. We are using the Marlin 2.1.x bugfix with advanced_ok active.

Norvat commented 2 months ago

@ChihebMadiouni Hi, it's good to know I'm not alone with these issues. What troubleshooting steps have you tried?

kisslorand commented 2 months ago

@Norvat @ChihebMadiouni Guys, I highly recommend trying the firmware from my repository.

Norvat commented 2 months ago

@kisslorand Running a test print now, will report back.

digant73 commented 2 months ago

If you don't have a reliable connection, simply disable advanced_ok and command_checksum.

Norvat commented 2 months ago

@digant73 Disabling those features causes the printer to freeze. Do you have any ideas on how I could troubleshoot this?

Norvat commented 2 months ago

@kisslorand Your firmware returned an "unknown command" error 30 minutes in, but that is a lot better than how it was running with the stock firmware. Any tips on how I could troubleshoot the serial connection? I have made my own shielded cable, so the cable should be good.

image

The baud rate is currently set to 57600.

ChihebMadiouni commented 2 months ago

I tested a 1-hour print with @kisslorand's firmware version at a 250k baud rate, and the print finished with no errors and no stops. However, I will continue testing to be sure.

Norvat commented 2 months ago

@ChihebMadiouni Do you think running a lower baud rate is causing my problems? I set it lower during testing to make sure I wasn't overloading the TFT or the mainboard.

Edit: The printer froze again, now with a "busy processing" error. It is still possible to control the printer after stopping the print job, so it's not a hard crash. [image]

It might be an issue with my hardware then? Is there any easy way to test that?

Edit 2: I have also moved the cable away from the rest of the wires going to the toolhead, but it didn't help. [image]
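On the baud-rate question, a rough estimate may help. A lower baud rate is generally more tolerant of noise, not less; what it limits is throughput. The sketch below assumes standard 8N1 framing (10 bits on the wire per byte) and an average G-code line of 40 bytes. The 40-byte figure and the function name are illustrative assumptions, not measurements from this printer.

```python
def max_lines_per_second(baud, avg_line_bytes, bits_per_byte=10):
    """Upper bound on G-code lines per second a serial link can carry.

    bits_per_byte=10 assumes 8N1 framing: 1 start bit, 8 data bits, 1 stop bit.
    avg_line_bytes is an assumed average line length, including the newline.
    """
    bytes_per_second = baud / bits_per_byte
    return bytes_per_second / avg_line_bytes


for baud in (57600, 115200, 250000):
    print(baud, max_lines_per_second(baud, avg_line_bytes=40))
```

Under these assumptions, 57600 baud still carries about 144 lines per second and 250000 baud about 625, so the link speed alone is unlikely to explain corrupted characters.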

kisslorand commented 2 months ago

@Norvat I think you have a high level of EMI at your printer. I see serial communication problems in both directions (TFT->MB, MB->TFT).

I would check the pull-up resistors on the RX and TX lines of both the TFT and the motherboard. If they are 10k or higher, I would change them to 4.7k.

Later I will check some boards I have around. I remember something about BTT usually using fairly high values for pull-up resistors.

Later edit: I just checked a few boards and a TFT. It seems BTT uses internal pull-ups on both the MB and the TFT; only the MKS TFT has an external pull-up resistor on the TX pin.

ChihebMadiouni commented 2 months ago

@Norvat I recommend you switch to the Marlin 2.1.x bugfix; the 2.1.2.2 release may have some bugs. Here are also some features I have enabled in Marlin that may help:
#define ADVANCED_OK
#define TX_BUFFER_SIZE 32
#define RX_BUFFER_SIZE 1024
#define MAX_CMD_SIZE 96
#define BUFSIZE 4
#define SERIAL_OVERRUN_PROTECTION
#define SERIAL_DMA
#define EMERGENCY_PARSER

Norvat commented 1 month ago

Thank you so much for the suggestions.

I will investigate further later this week.

digant73 commented 1 month ago

> @digant73 Disabling those features causes the printer to freeze. Do you have any ideas how i could troubleshoot?

The printer freezes when the TFT doesn't receive an ACK from the mainboard (e.g. due to EMI). Once the TFT has no available TX slots left (there is only 1 when advanced_ok is disabled), which also means it has pending commands that will never be acknowledged, it will not send any further commands.
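The stall described above can be illustrated with a toy model. This is only a sketch of the idea, not the firmware's actual code: the function name, the "one ack per command, processed immediately" timing, and the slot accounting are all simplifying assumptions, and the real firmware may track slots differently when advanced_ok is enabled.

```python
def simulate_sender(num_commands, tx_slots, lost_acks):
    """Toy model of a sender that needs a free TX slot for every command.

    lost_acks is a set of 0-based command indices whose "ok" never arrives.
    Returns (sent, free_slots, pending) when the sender finishes or stalls.
    """
    free_slots = tx_slots
    pending = 0  # commands sent but never acknowledged
    sent = 0
    for index in range(num_commands):
        if free_slots == 0:
            break  # no slot left: nothing more is sent, the print appears frozen
        free_slots -= 1
        pending += 1
        sent += 1
        if index not in lost_acks:
            # The "ok" arrived, so the slot is released again.
            free_slots += 1
            pending -= 1
    return sent, free_slots, pending


# One slot (advanced_ok disabled in this model): a single lost "ok" stalls the sender.
print(simulate_sender(10, tx_slots=1, lost_acks={3}))
# Four slots: the same lost "ok" costs one slot, but sending continues.
print(simulate_sender(10, tx_slots=4, lost_acks={3}))
```

In this model the stalled state is exactly "free slots at 0, pending commands above 0", which is the pattern to look for on the TFT when a print stops.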

MrKuskov commented 1 month ago

@Norvat I had the same errors on a B3; here are my settings. Everything works correctly. https://github.com/bigtreetech/BIGTREETECH-TouchScreenFirmware/issues/2910#issuecomment-1988557561

Norvat commented 1 month ago

Did some more testing.

I tried to print without the heated bed, and it printed for an hour without errors. I then set the bed temperature, and the printer either stopped right away or made some weird movements that aren't in the G-code. It seems to be especially bad while heating up.

It's a 24V 500x500mm bed (probably pulling 15A), and the controller is placed about 5cm away from it. The original controller board was in a metal enclosure, but I have made a plastic one to fit the SKR board, so I think the issue is EMI from the heated bed. I tried wrapping the electronics enclosure in grounded aluminum foil, but it did not work. My wrapping might have been a bit shoddy.

I am going to try to move the electronics away from the printer next.

rondlh commented 1 month ago

@Norvat Please be careful with Kisslorand's closed source firmware. It's known to be buggy, unreliable and slow. His FW is far behind what you can find here. There is a reason why his "contributions" are ignored https://github.com/bigtreetech/BIGTREETECH-TouchScreenFirmware/pulls. Also note that Kisslorand has claimed to have tested lots of things, but that turned out to be complete nonsense.

From your screenshots I can see that you have serial data corruption. Using checksum will detect and report this, disabling checksum does not solve the issue, it just doesn't tell you when a serial communication error occurred.
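For illustration, here is a minimal sketch of how a Marlin-style line checksum catches corruption. It assumes the usual `N<line> <command>*<checksum>` framing, where the checksum is the XOR of every byte before the `*`; whether the TFT's command_checksum option uses exactly this framing is an assumption here, and the helper names are hypothetical.

```python
def checksum(payload: str) -> int:
    """XOR of all bytes in the payload (everything before the '*')."""
    value = 0
    for byte in payload.encode("ascii"):
        value ^= byte
    return value


def frame(line_number: int, gcode: str) -> str:
    """Prefix a line number and append the checksum."""
    payload = f"N{line_number} {gcode}"
    return f"{payload}*{checksum(payload)}"


def is_valid(framed: str) -> bool:
    """Recompute the checksum on the receiving side and compare."""
    payload, separator, received = framed.rpartition("*")
    return bool(separator) and received.isdigit() and checksum(payload) == int(received)


good = frame(1, "G28")
print(good, is_valid(good))       # the intact line passes
print(is_valid("N1 G29*18"))      # one changed character is rejected
print(is_valid("N1 G2*18"))       # one dropped character is rejected
```

Without the checksum, the two damaged lines would simply be parsed as whatever they happen to say, which fits the unexpected movements and "unknown command" errors reported above.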

You already did a lot of useful tests. Make sure your serial wires are not close (in parallel) to your stepper cables. Could your power supply be faulty or overloaded? If possible, measure the board voltages with a scope during printing. BTW: I'm running a similar setup (BTT TFT35 GD 3.0 + LPC1768, but not an SKR board) without issues at 1M baud. I also have a bed that draws 12A at 36V, and I keep these wires separated from all low-signal wires such as the serial communication wires.

Norvat commented 1 month ago

@rondlh Thank you for the helpful comment. I will do some measurements after I get access to my oscilloscope again in a week or two.

The PSU is the original that came from Tronxy (this is a Tronxy X5SA 500 Pro), so it might not be the best. I have a smaller spare PSU, so I will try to run power to the bed separately from the controller.

If the psu is the issue, would it be due to a varying voltage or some frequency?

rondlh commented 1 month ago

> If the psu is the issue, would it be due to a varying voltage or some frequency?

That's difficult to predict. To me it's clear that something is going seriously wrong. The PSU could be noisy, and/or inject high-frequency noise, or even dip down to a low voltage for a short time when overloaded. If the voltage dipped, the MCUs on the motherboard and the TFT would probably freeze, so that's not very likely, but a scope will show you.

You could actually do some tests without heating the bed; that might help to point you in the right direction. I recommend leaving the TFT checksum feature on, so you will be informed if any serial corruption takes place.

Another thing you could try is to add an external serial connection to listen to the serial data flow. Make sure to only listen (connect the RX of your added port, don't connect its TX). On your computer you can then use PuTTY to see the data flow. You connect your external RX to either the RX or the TX on your motherboard. If you connect it to RX, you can see the data the motherboard receives from the TFT; if you connect it to TX, you can see the data the motherboard sends to the TFT.

digant73 has recently added the "info screen" that can help to diagnose the problem. He left a few debug lines at the end of Monitoring.c. You can uncomment the "if" and one of the first 3 "mustStoreCmd" lines (FOR TESTING ONLY!!!). The code makes the TFT send commands to the motherboard as fast as possible, to which the motherboard responds. If everything is going well, you should see lots of RX/TX commands and data in the info screen, and the numbers should be relatively stable. If they go to 0, the serial communication has broken down.
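Once such a listen-only capture has been saved to a file, it can be scanned for obvious damage. The sketch below is a hypothetical helper, not part of any firmware or tool mentioned in this thread. It assumes the captured traffic is plain ASCII G-code and replies, so any byte outside the printable range (tabs included) is treated as suspicious.

```python
def suspicious_lines(captured: bytes):
    """Return (line_number, raw_line) pairs that contain non-printable bytes.

    Assumes ASCII traffic split on newlines; a trailing carriage return is ignored.
    """
    flagged = []
    for number, raw in enumerate(captured.split(b"\n"), start=1):
        line = raw.rstrip(b"\r")
        if any(byte < 0x20 or byte > 0x7E for byte in line):
            flagged.append((number, line))
    return flagged


# Example capture with one damaged line (a stray 0xFF byte inside a move command).
capture = b"ok\nG1 X10 Y\xff0\nok T:210\r\n"
for number, line in suspicious_lines(capture):
    print(number, line)
```

This only catches bytes that are clearly not text. Dropped characters or digits swapped for other digits still look printable, which is why leaving the checksum feature on remains the more reliable detector.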

Norvat commented 1 month ago

Small update:

I have connected a separate PSU to the controller, and that seems to have done the trick with the errors; the printer runs happily.

BUT!

If I ground the cable from the controller to the TFT, I get errors. I can start a print without it connected, and the second I connect it I get errors.

I scoped the ground on the original PSU and am seeing ripple of ±3V. I see ±0.5V on the shielding of the TFT cable while it is not grounded. There is almost no ripple from the new PSU. I tried getting pictures of the readings, but the oscilloscope is an old analog one, so that is not too easy.

So I think the issue is PSU related, as rondlh says.

I will test a bit more scientifically later this week, and try the tips above.

karabas2011 commented 1 month ago

MKS Eagle with Marlin 2.0.9.3 and 2.1.2.2, plus an MKS35 TFT with the latest BTT dev firmware, which I compiled myself.

The printer stops suddenly; it does not hang. This happens with or without advanced_ok / CRC; I tried everything. The terminal is on for monitoring: no garbage and no missing ok (there were some in certain configurations). It seems as if the TFT simply stops sending the next G-code command.

Finally I tried kisslorand's firmware: a successful 14-hour print, but speeds above ~60 mm/s cause pauses of ~1 s.

karabas2011 commented 1 month ago

I flashed Marlin 2.1.2.3 yesterday, plus the latest BTT firmware, and shortened the cable to 20 cm. Nothing changed: still a sudden stop. I tried to move to 2.0.7.3 but cannot compile it for the MKS Eagle.

As far as I can see, the TFT continues to receive messages from the board but stops sending anything. Pressing the abort button fixes it. There is no garbage on the TFT terminal. It stops after receiving ok.

rondlh commented 1 month ago

@karabas2011 Please be careful with Kisslorand's closed source firmware. It's known to be buggy, unreliable and slow. His FW is far behind what you can find here. There is a reason why his "contributions" are ignored https://github.com/bigtreetech/BIGTREETECH-TouchScreenFirmware/pulls. Kisslorand's closed source FW might damage your motherboard and/or your TFT display, so better stay clear. Also, the topic you raise here seems to be unrelated to the issue raised by Norvat, please start a new issue if you need support.

digant73 commented 1 month ago

> As I can see, TFT continues to receive messages from board but stops to send anything. Pressing abort button fixes it. On TFT terminal no garbage. Stops after receiving Ok

On the TFT side, try disabling as many features as possible, such as advanced_ok, command_checksum, event_led and file_comment_parsing. Also, when the TFT stops sending G-codes to the mainboard, what do you see in the TFT's stats page? Do you see Free TX slots at 0 and Pending gcodes different from 0?