markwal / OctoPrint-GPX

An OctoPrint plug-in to use GPX as the protocol layer underneath rather than replacing g-code to talk to s3g/x3g machines, for example, a FlashForge.
GNU Affero General Public License v3.0

Sometimes OctoPrint waits forever for heatup after canceling a job and starting a new one. #26

Open markwal opened 7 years ago

markwal commented 7 years ago

See foosel/OctoPrint#1433

Quote from @sfsdfd:

Okay - I have results. It's a slightly different scenario than I described above, but similar behavior.

Check this out:

http://pastebin.com/QQLGAWck

I loaded a print and then canceled it because I wanted to print it a little differently. I uploaded the new print, and then clicked Print - and... OctoPrint just sat there. It changed the status from "Operational" to "Printing," and did nothing else.

I restarted OctoPrint and then loaded and printed the same model, and got this instead:

http://pastebin.com/pc2Tbh1R

...which is running as expected.

In general, OctoPrint seems to have some non-deterministic behavior when you cancel a print, either via OctoPrint or via the printer:

Sometimes it works OK.

Sometimes (frequently), it does this zombie-state thing the next time you ask OctoPrint to print anything.

Sometimes, when OctoPrint starts the next print, it actually reports that the print is progressing. The status changes to Printing, and layers fly by very quickly. But when you look at the actual printer... nothing is happening; it's sitting there in an idle state. Seriously, this has happened maybe five times. Very weird. I'll definitely send terminal tab contents the next time I catch it in the act.

Once - yesterday - I canceled the print via the printer LCD, and my bed and nozzle returned to their idle positions... yet, the nozzle kept twitching and extruding in mid-air. Apparently, OctoPrint hadn't gotten the message, and was still streaming instructions to it after the print was canceled.

sfsdfd commented 7 years ago

Thanks, Mark.

I'll admit that I'm uncertain of the delineation of responsibilities between GPX and OctoPrint. I presumed that GPX was a straightforward translation pipeline: (GCode instructions) <--> (X3G instructions). But your action in taking ownership of this issue suggests a more substantial role, something involving the actual interpretation of input.

Let me know if any other information would be of use in diagnosing these problems and testing any fixes. If you're looking for something specific, I'll watch for that behavior and try to capture the relevant terminal tab contents.

pearson222 commented 7 years ago

Has this been solved? I have been having the same issue for a long time and have to reboot OctoPrint, delete the file, and re-upload it whenever I use the "Cancel" button.

markwal commented 7 years ago

Well, kind of. I've made significant changes to the cancel code in every release and there are some pending. However, it's possible that the issue you are hitting isn't really fixable because the protocol has baked into it a flaw that makes it possible for the host and the bot to get out of sync on the cancel state, but I haven't seen this problem myself on the development branch for a while now.

Of course, I never really had to reboot, delete, and re-upload a file. At the most extreme, what I had to do was turn off the printer, turn it back on after OctoPrint showed disconnected, and then hit the connect button. No re-upload, just print.

pearson222 commented 7 years ago

Once a file is cancelled it is rendered unusable and can never be printed again, no matter what is rebooted. It just sits in the file window with its file name highlighted in red. Recently, when I have pressed the print button on a cancelled file, it changes the status to "Printing" and runs quickly through each layer (the progress bar goes from 0% to 100% in 60 seconds, and the Gcode Viewer runs through each layer as if printing at 1000x print speed), all while the printer does nothing. Very weird.

markwal commented 7 years ago

But that's because you didn't turn off the printer, wait for OctoPrint to show disconnected, and then turn it back on and hit reconnect. It's not the upload that is fixing it, it's the reboot. (I know the symptom you are seeing, and it isn't persisted in the gcode file. It's an in-memory variable that says: don't do anything until the printer confirms that the cancel is complete, just pretend you're sending it.)
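
For readers following along, here is a minimal toy model of the symptom described above. It is not the OctoPrint-GPX code: `ToyHost`, `cancel_pending`, and `print_file` are hypothetical names made up for illustration. It only shows how an in-memory "cancel pending" flag could make a job appear to race to 100% while nothing reaches the printer, and why reconnecting (rather than re-uploading) clears it.

```python
class ToyHost:
    """Hypothetical stand-in for the host side; not the real plug-in."""

    def __init__(self):
        self.cancel_pending = False  # in-memory only, never stored in the file
        self.sent = []               # lines that actually reached the "printer"

    def cancel(self):
        # Ask the printer to cancel, then wait for its confirmation.
        self.cancel_pending = True

    def on_cancel_confirmed(self):
        # Called when the printer reports that the cancel completed.
        self.cancel_pending = False

    def reconnect(self):
        # Disconnecting and reconnecting rebuilds the state from scratch.
        self.cancel_pending = False

    def send(self, line):
        # While a cancel is pending, pretend the line was sent.
        if not self.cancel_pending:
            self.sent.append(line)
        return "ok"


def print_file(host, lines):
    """Stream lines and return how many the host acknowledged."""
    return sum(1 for line in lines if host.send(line) == "ok")
```

In this sketch, if `cancel()` is called and `on_cancel_confirmed()` never fires, `print_file` still acknowledges every line while `host.sent` stays empty. Calling `reconnect()` resets the flag, and the same file then streams normally.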

markwal commented 7 years ago

It will still show red because that just means the last attempt to print it failed. It'll still print (as long as you clear the state I mentioned).

markwal commented 7 years ago

And in the version you have, it should be sufficient to click the disconnect button and then, when that succeeds, click the connect button. The only time you should have to turn off the printer is if OctoPrint won't disconnect via the UI.

BigE2 commented 7 years ago

I've found that it seems to be related to Timelapse rendering. I used to be able to start a print while rendering, and nothing bad happened. Now, if I cancel a print and immediately try to start it again, I get that weird error as described above. But if I wait until after the timelapse rendering is complete, it seems not to happen.

The weird error above was random enough that I can't completely say the timelapse is a problem, but it seems to be related. This is on an original RPi.

My other FFCP on an RPi3 doesn't seem to be affected by this error. Both have virtually identical setups. (I copied the SD card, then changed the network name on the second one).

My 2 cents.

stuartpb commented 6 years ago

Shouldn't it be possible to basically force the reconnection workaround by re-initializing the connection whenever a print is cancelled via OctoPrint?

markwal commented 6 years ago

@stuartpb maybe, but it has some of the same difficulty, because "cancel complete" is the very thing that is hard to detect without a race, and if it disconnects too soon, it won't send the cancel code that tells the printer to turn off the heaters and the motors.
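
A rough sketch of the ordering constraint in this exchange, under stated assumptions: `cancel_then_reconnect` and its callback parameters are hypothetical and not part of OctoPrint or GPX. The idea is to send the cancel first, wait a bounded time for the printer's confirmation, and only then force the reconnect, so the heaters-off and motors-off cancel code is not skipped. It does not remove the underlying race; the timeout only bounds how long the host waits.

```python
import threading


def cancel_then_reconnect(send_cancel, cancel_done, reconnect, timeout=5.0):
    """Send the cancel, wait for confirmation, then reconnect.

    send_cancel and reconnect are caller-supplied callables (assumed);
    cancel_done is a threading.Event set when the printer confirms.
    Returns True if the cancel was confirmed, False on timeout.
    """
    send_cancel()                          # cancel goes out before any disconnect
    confirmed = cancel_done.wait(timeout)  # False if the timeout expired
    reconnect()                            # reset the connection either way
    return confirmed
```

A `False` return would mean the host reconnected without ever seeing the confirmation, which is the case the comment above warns about; a caller might want to surface that to the user rather than silently start the next job.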