bigtreetech / BIGTREETECH-TouchScreenFirmware

support TFT35 V1.0/V1.1/V1.2/V2.0/V3.0, TFT28, TFT24 V1.1, TFT43, TFT50, TFT70
GNU General Public License v3.0

[BUG] Randomly M21/M1 #2295

Closed Randwic closed 2 years ago

Randwic commented 2 years ago

Description

M1/M21

Steps to reproduce

  1. Try to print from the SD port of the TFT
  2. Try to print from the USB port of the TFT

Expected behavior

No print error

Actual behavior

I get M21 or M1 errors at random while printing.

Hardware Variant

TFT70 V3.0

TFT Firmware Version & Main Board Firmware details

TFT 3.27.0 and Marlin 2.0.9.2

Additional Information

Hello. First, sorry for my English, I'm French; and sorry again if this is the wrong section, this is the first time I've filed a request on GitHub. I recently installed an SKR v2 board and a TFT70 from Biqu on my FLSUN Super Racer. Since then my prints have been failing: sometimes the printer stops with the M21 error or the M1 error, although I have not configured a pause in the slicer. Printing has also stopped without any error shown on the TFT. I have tried printing from the SD card as well as from USB on the TFT and I get the same problems; however, I have no problem when using the USB stick on the motherboard. I have the latest TFT firmware as well as the latest firmware on the motherboard. Would you have any idea what the problem is? Thanks in advance.

V1EngineeringInc commented 2 years ago

Removing M0 is not an option for us, we use it for tool changes.

ACNC88 commented 2 years ago

I'm willing to try kisslorand's mod, but I'm still skeptical this has to do with the TFT. I used the same TFT with 2 different mini V2.0 boards and never had this issue before. I only get the M1 message, and it shows up with the printer idle too. Another bizarre thing with this board is that it occasionally pauses for a fraction of a second, then keeps going with the job. I'm not sure if that has to do with linear advance, since I only started using it with this board. I'm also going to compile my own version of the firmware, with the buffer size changes suggested by others here. This is quite annoying, as I can't rely on my printer with this board in it and I don't plan on using the display in Marlin mode.

kisslorand commented 2 years ago

I'm still skeptical this has to do with the TFT

It might not originate in the TFT, but it is already established that it arrives over the serial port the TFT is connected to. Marlin sends its feedback for M0, M1 and M21 only to the port the command came from, and the feedback shows up on the TFT's port. When the printer is idle, the only commands sent to Marlin are M220 and M221; they are also sent during printing. Separating M220 from M221 simply followed common-sense logic, nothing more. It is a theory yet to be proven right or wrong; either way it does no harm.
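
To make the separation idea concrete, here is a minimal Python sketch. It is not the TFT firmware's code, and the thread does not say how the patch actually spaces the two queries; the tick model and both function names are hypothetical. It only contrasts sending M220 and M221 back to back with sending one query per polling tick.

```python
from itertools import cycle
from typing import List


def paired_queries(ticks: int) -> List[List[str]]:
    """Behaviour described in the thread: both queries go out in the same tick."""
    return [["M220", "M221"] for _ in range(ticks)]


def separated_queries(ticks: int) -> List[List[str]]:
    """Assumed patched behaviour: one query per tick, alternating M220 and M221."""
    source = cycle(["M220", "M221"])
    return [[next(source)] for _ in range(ticks)]


print(paired_queries(2))     # [['M220', 'M221'], ['M220', 'M221']]
print(separated_queries(3))  # [['M220'], ['M221'], ['M220']]
```

With the separated schedule no tick ever carries two commands in a row, which is the property the theory relies on.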

ACNC88 commented 2 years ago

Yeah sorry, I agree with your reasoning. What I meant is: why does the V2.0 have no issues handling M220 and M221 together? By the way, can someone from BIQU step in to help too?

kisslorand commented 2 years ago

More than 48 hours with my test rig on with M220 and M221 separated and not a single hiccup... Does anyone else (especially those who complained) care to join in testing the patch??? I am not doing it for myself... I will not submit a patch to the FW until there is feedback from others.

By the way can someone from BIQU step in to help too?

By the way, can anyone who complained step in and try the suggested patch?

For those who are bothered by this bug and cannot make the necessary changes themselves, just let me know what TFT you have and I'll compile a FW for you.

mustang5269 commented 2 years ago

I made the changes and will test to see if it works.

mustang5269 commented 2 years ago

After making changes, I've left mine on for 10 minutes and got the M1 stop message. I've also had a no sd card message pop up as well.

BradisBowser commented 2 years ago

I will make changes today and report back

mustang5269 commented 2 years ago

I went back into marlin and set the buffers back to stock and I haven't had the errors pop up as of yet. I've noticed a few who raised the buffers as I did. I followed a skr mini e3 v3 guide who suggested raising them and I wonder if that's why people are having the same issue.

mustang5269 commented 2 years ago

if I'm not wrong, I had similar issues in the past with MKS GEN L using the default values:

#define BUFSIZE 4
#define TX_BUFFER_SIZE 0

and I solved that issue using the following settings in Configuration_adv.h:

#define BUFSIZE 16
#define TX_BUFFER_SIZE 16

With those values I never had any issue

I think this solved the issue for me as well.

V1EngineeringInc commented 2 years ago

We use the SKR Pro, and have not made any buffer changes. Still have the issue.

kisslorand commented 2 years ago

After making changes, I've left mine on for 10 minutes and got the M1 stop message. I've also had a no sd card message pop up as well.

This is bad news. :(

How ironic: after 60 hours of monitoring the serial communication with M220 and M221 separated I had not a single issue, and now, after leaving my test rig on for two hours with a FW that does not separate M220 and M221 and with no monitoring enabled, I got an M21 strike.

Regarding the buffers... I have BUFSIZE 16 and TX_BUFFER_SIZE 0 for two years now and never had any issue with M0, M1, M21 until I changed to SKR v2.

What surprises me more is that no such issues are reported on Marlin's GitHub, or at least I couldn't find any.

Later edit: after 20 minutes I had another M21 with a FW with M220 and M221 sent together.

We use the SKR Pro, and have not made any buffer changes. Still have the issue.

I see it has the same MCU as SKR v2 RevB (first batch). It might be that only fast MCUs have this problem.

V1EngineeringInc commented 2 years ago

That seems at least a little promising. Some common thread.

After seeing the buffer post I started to dig deeper into the configuration_adv.h file and there are a ton of options that are fairly new. I would just not know where to start. Any sort of debugging we can do?

https://github.com/MarlinFirmware/Marlin/blob/bugfix-2.0.x/Marlin/Configuration_adv.h#L2248

kisslorand commented 2 years ago

Well... I found something... By accident, part of the serial communication was logged on my PC when M21 struck. It shows that Marlin responds to M220 and then responds to M21. There is no response for M221. Before and after the M21 appearance, the response to M220 is followed by a response to M221. For me it is clear that this pair of successive M220 + M221 commands is an issue. With only a partial log I cannot determine whether the TFT sent M221 or M21, but I am sure Marlin processed an M21 command and never saw an M221 after that M220.

Here's a part of the log:

wait
wait
FR:100%
echo:E0 Flow: 100%
wait
wait
FR:100%
echo:No SD card
wait
FR:100%
echo:E0 Flow: 100%
wait
wait

I don't know if it is important, but there is only one "wait" after "No SD card"; in every other case there are two. I suspect Marlin responds too fast while the TFT is still transmitting, so M221 gets truncated (to M21 or M1) and the second "wait" is missing because the TFT was still transmitting.
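
The truncation hypothesis can be checked on paper with a short Python sketch. It assumes, purely for illustration, that corruption means characters are dropped from the command while the remaining ones keep their order; the helper name `truncations` is hypothetical.

```python
from itertools import combinations


def truncations(command: str, dropped: int) -> set:
    """Return every string obtained by deleting `dropped` characters from `command`."""
    keep = len(command) - dropped
    return {
        "".join(command[i] for i in indices)
        for indices in combinations(range(len(command)), keep)
    }


# Commands that the thread reports being executed unexpectedly.
REPORTED = {"M21", "M1"}

one_lost = truncations("M221", 1)  # one character lost in transit
two_lost = truncations("M221", 2)  # two characters lost in transit

print(sorted(one_lost & REPORTED))  # ['M21']
print(sorted(two_lost & REPORTED))  # ['M1']
```

Under that assumption, losing one character of M221 can produce M21 and losing two can produce M1, which matches the two errors reported here. It does not show where the characters would be lost.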

Note

"FR:100%" is a response to M220, "echo:E0 Flow: 100%" is a response to M221 and "echo:No SD card" is a response to M21.
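
For anyone who wants to scan a longer capture for the same pattern, here is a small Python sketch. It uses only the response-to-command mapping from the note above and the log excerpt posted earlier; the function names are hypothetical and the matching is by simple line prefix, so other firmware messages are ignored.

```python
# Mapping taken from the note above: response prefix -> command that caused it.
RESPONSE_TO_COMMAND = {
    "FR:": "M220",
    "echo:E0 Flow:": "M221",
    "echo:No SD card": "M21",
}


def classify(line: str):
    """Return the command a response line belongs to, or None (e.g. for "wait")."""
    for prefix, command in RESPONSE_TO_COMMAND.items():
        if line.startswith(prefix):
            return command
    return None


def find_anomalies(log_lines):
    """Return (line index of the M220 reply, command answered next) for each
    M220 reply whose next recognised reply is not an M221 reply."""
    replies = [(i, classify(line.strip())) for i, line in enumerate(log_lines)]
    replies = [(i, cmd) for i, cmd in replies if cmd is not None]
    anomalies = []
    for (index, cmd), (_, following) in zip(replies, replies[1:]):
        if cmd == "M220" and following != "M221":
            anomalies.append((index, following))
    return anomalies


LOG = """wait
wait
FR:100%
echo:E0 Flow: 100%
wait
wait
FR:100%
echo:No SD card
wait
FR:100%
echo:E0 Flow: 100%
wait
wait""".splitlines()

print(find_anomalies(LOG))  # [(6, 'M21')]
```

An M220 reply at the very end of a capture is not flagged, since the log may simply have been cut off before the M221 reply arrived.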

V1EngineeringInc commented 2 years ago

But when you were running separate commands, you still had the issue happen?

I do get random sd card messages as well. I wonder what happens if we run off the USB stick instead.

kisslorand commented 2 years ago

But when you were running separate commands, you still had the issue happen?

No. As I said earlier, I had my test rig on for 60 hours with M220 and M221 separated, with monitoring permanently on, and did not see a single misbehavior.

In the last 3 hours or so, with M220 and M221 not separated, I had four M21 responses. I do not know anything about M1 because I disabled it in Marlin.

V1EngineeringInc commented 2 years ago

Oh, I can give it a shot. If that is the case, opening a PR request to have that changed either way is probably a good idea.

mustang5269 commented 2 years ago

I haven't had any M1 errors pop up but I do get a sd card init error which I don't know why because they all work.

kisslorand commented 2 years ago

opening a PR request to have that changed either way is probably a good idea.

I don't know what to say... There are some people here who are not happy if code is increased without a solid reason. The separation of M220 and M221 is not a certainty for the bug elimination as for @mustang5269 it didn't work. Who knows? Maybe together with a TX_BUFFER_SIZE increase it can make a difference for everybody. For me, on my test rig and on my printer there was not a single M21 as long as I had the M220 and M221 separated. I repeat, it is not a certainty yet.

ACNC88 commented 2 years ago

I run off the USB drive and I still have the problem. What files are you guys using to compile your own firmware, the V2.0 ones? There is no build files in the V3.0 folder, just precompiled FW.

mustang5269 commented 2 years ago

I run off the USB drive and I still have the problem.

What files are you guys using to compile your own firmware, the V2.0 ones? There is no build files in the V3.0 folder, just precompiled FW.

https://github.com/bigtreetech/Marlin/tree/SKR-mini-E3-V3.0-G0B1

Guilouz commented 2 years ago

opening a PR request to have that changed either way is probably a good idea.

I don't know what to say... There are some people here who are not happy if code is increased without a solid reason. The separation of M220 and M221 is not a certainty for the bug elimination as for @mustang5269 it didn't work. Who knows? Maybe together with a TX_BUFFER_SIZE increase it can make a difference for everybody. For me, on my test rig and on my printer there was not a single M21 as long as I had the M220 and M221 separated. I repeat, it is not a certainty yet.

I use #define TX_BUFFER_SIZE 64, so i think buffer is not the cause. Same configuration in Marlin and for a very long time without problems and the errors appeared suddenly following many PRs. Something has changed but what? That is the question...

digant73 commented 2 years ago

@Guilouz Try to use the same value for:

#define BUFSIZE
#define TX_BUFFER_SIZE

E.g.

#define BUFSIZE 16
#define TX_BUFFER_SIZE 16

The speed on TFT was improved and that could be the reason of the issue in Marlin. The TFT fw has no bug.

ACNC88 commented 2 years ago

@digant73 I have just compiled the FW with the buffers increased to 16. Printing as I write, and I will let you know how it goes... already printed 2 test cubes with no stops.

kisslorand commented 2 years ago

Guys, the issue not appearing after a few hours doesn't mean anything. I had my rig powered on for 60 hours without any problems, yet for @mustang5269 the patch didn't work. Some of you might remember the bug when during print the flow rate went down to 10%. It was the same dang M221! Now it strikes back in another form.

The speed on TFT was improved and that could be the reason of the issue in Marlin.

I highly doubt that. I use my MKS TFT at this speed (72MHz) for over a year now and had no such issue and my test rig uses BTT TFT35 and I do not remember any speed improvements for that one.

The TFT fw has no bug.

How do you know that?

mustang5269 commented 2 years ago

I've left your fix and put the buffer values back to what BIGTREETECH had for stock and no issues so far. 🤷‍♂️

kisslorand commented 2 years ago

And what are those stock values?

mustang5269 commented 2 years ago

image

kisslorand commented 2 years ago

So... I had a look at Marlin regarding this disputed TX_BUFFER_SIZE... On modern motherboards that setting doesn't really do much unless it is set to high values (128, 256). The motherboards use their own, hardcoded settings. For example, the BTT SKR mini E3 V3.0 uses 1024 for its serial TX buffer, regardless of what you set TX_BUFFER_SIZE to. SKR v2 doesn't take TX_BUFFER_SIZE into account unless it is higher than 64 (128, 256). Just compile your Marlin with different TX_BUFFER_SIZE values and check how much extra RAM is used; you'll be surprised that up to a certain value there's no increase.

mustang5269 commented 2 years ago

So...

I had a look at Marlin regarding this disputed TX_BUFFER SIZE...

That setting in modern motherboards doesn't really do much, only if it set to high values (128, 256). The motherboards use their own, hardcoded settings.

For example BTT SKR mini E3 V3.0 uses 1024 for its serial TX buffer, regardless of what you set TX_BUFFER_SIZE.

SKR v2 doesn't account for TX_BUFFER_SIZE unless it is higher than 64 (128, 256).

Just compile your Marlin with different TX_BUFFER_SIZE and check how much extra RAM is used, you'll be surprised that to a certain value there's no increase.

Good info. Ya got me then.

ACNC88 commented 2 years ago

Bumping the buffers to 16 did nothing for me: the M1 error popped up after a 2h print was done, with the machine idle. I then made kisslorand's changes to the TFT FW, printed a cal cube and let the printer sit overnight. No errors yet, now printing another 2h job. I will let you know how that goes! PS: I left the buffers at 16, since this shouldn't make any difference, right?

V1EngineeringInc commented 2 years ago

I should be able to give this a shot pretty soon, fingers crossed and pretty hopeful.

ACNC88 commented 2 years ago

2h job done, 1h idle and now midway through second job without random stops...let's wait and see....

ACNC88 commented 2 years ago

Still no errors after 6+h printing and perhaps 10h idling. I never went this far before. Maybe the end of a nightmare? Thanks @kisslorand. Still it remains a mystery why the SKR mini V2.0 has no issues handling M220/M221 together....

BradisBowser commented 2 years ago

More than 48 hours with my test rig on with M220 and M221 separated and not a single hiccup... Anyone else (especially those who complained) cares to join testing the patch??? I am not doing it for myself... There will be no patch made into the FW on my behalf until there's no feedback from others.

By the way can someone from BIQU step in to help too?

By the way, anyone who complained can step into and try the suggested patch ?

For those who are bothered by this bug and cannot make themself the necessary changes, just let me know what TFT you have and I'll compile a FW for you.

I've been struggling with this a bit. Any chance I could take you up on this offer? I'm running TFT35 E3 V3 with SKR mini E3 V3.

kisslorand commented 2 years ago

Sure. PR #2325 and PR #2291 are also included.

BIGTREE_TFT35_V3.0_E3.27.x.zip config.zip language_en.zip

ACNC88 commented 2 years ago

So towards the end of my second print, the printer came to a halt, similar to what happens with the M1 stop, but without any dialog box coming up. Impossible to continue. I pushed the pause button but nothing happened. I'm totally fed up with this bug!

BradisBowser commented 2 years ago

Sure. PR #2325 and PR #2291 are also included.

BIGTREE_TFT35_V3.0_E3.27.x.zip config.zip language_en.zip

Big thanks for all the work on this. New firmware is installed and starting an overnight print. Will report back any news.

V1EngineeringInc commented 2 years ago

Well I am about 12 hours in on two printers and so far no SD card notices and no M1 pauses. Normally I surely would have seen at least one by now.

Seeing these new comments, I am very interested to see if they have errors overnight.

BradisBowser commented 2 years ago

Printer ran all night with no stops (during print or at idle). Starting another 11hr project now and I'm feeling optimistic.

radek8 commented 2 years ago

@kisslorand Your edit seems to be working. Although a few users still report this error. For some, it solves the problem completely, for others it at least prolongs the interval of its occurrence. Most users will agree to merge this adjustment with the master branch.

Personally, I think the main problem is on the Marlin side, where incoming data is lost on the serial port. It can also be seen that most of the victims are on SKR-2, but that's just a guess.

elgroso52 commented 2 years ago

I just tried with your new update via octoprint connected to the motherboard(SKR miniE3V3) , After 10 minutes "M1 STOP". I now try the same printing via the USB port of the TFT 35 E3V3 screen...

kisslorand commented 2 years ago

@radek8 I am building and programming an Arduino to monitor the serial wires between the TFT and the mainboard, a HW monitor. I will not rest until I have a definitive answer as to who is responsible, the TFT or the mainboard. I am starting to feel that my "patch" just avoids the bug rather than eliminating it, because I was able to avoid M1/M21 in other ways too.

V1EngineeringInc commented 2 years ago

I'm willing to test your theories!

27 hours in on two printers with just the command separation fix, still good. A little more printing than idle, but they have not been power cycled.

V1EngineeringInc commented 2 years ago

A friend has a logger that he ran a few weeks back. All sorts of stuff was happening; here is a capture that bothered us.

https://us2.dh-cdn.net/uploads/db5587/original/3X/d/9/d9124d7a20cf062aee7b0f228a5d9f7738eb8339.png

kisslorand commented 2 years ago

It seems the mainboard had hard times calculating the ARC. What MB is it?

V1EngineeringInc commented 2 years ago

SKR Pro, to be fair I don't remember that Gcode, but we run CNC's up to 8' so some of those segments can be very long. Could it just be travel time and/or full buffer? ( I know very little about how the board actually handles this communication).

kisslorand commented 2 years ago

Ah! It figures. G3 is the gcode for ARC segments; if they were very long, then the RX buffer got full, hence the "Busy..." message.

V1EngineeringInc commented 2 years ago

We figured as long as it gets a busy signal, it keeps the communication open and does not time out.

kisslorand commented 2 years ago

As you can see, new gcodes are sent only after an "ok" acknowledgement.
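
As a rough illustration of that handshake, here is a minimal Python sketch of a sender that waits for "ok" before sending the next gcode and keeps waiting while busy notices arrive. `send_gcodes` and `fake_printer` are hypothetical names, the printer is a stand-in function rather than a real serial port, and the "echo:busy: processing" text is assumed to be the host keepalive message.

```python
def send_gcodes(gcodes, printer):
    """Send each gcode only after the previous one was acknowledged with "ok".

    `printer` is a hypothetical callable: given a command, it returns the
    reply lines the board would send back, in order.
    """
    sent = []
    for gcode in gcodes:
        sent.append(gcode)
        for reply in printer(gcode):
            if reply.startswith("ok"):
                break  # acknowledged: the next command may be sent
            # Any other line (for example a busy notice) just keeps us waiting.
        else:
            # Replies ran out without an "ok": treat it as a timeout.
            raise TimeoutError(f"no 'ok' received for {gcode!r}")
    return sent


def fake_printer(gcode):
    """Stand-in board: arc moves report busy twice before acknowledging."""
    if gcode.startswith("G3"):
        return ["echo:busy: processing", "echo:busy: processing", "ok"]
    return ["ok"]


print(send_gcodes(["G1 X10", "G3 X5 Y5 I1 J0", "G1 X0"], fake_printer))
# ['G1 X10', 'G3 X5 Y5 I1 J0', 'G1 X0']
```

The busy lines never release the next command; only "ok" does, which is why a long arc delays the queue without dropping the connection.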

We're getting offtopic...