evil-mad / EggBot

Software for The Original EggBot
GNU General Public License v3.0
Failure to connect in MacOS X 10.11 El Capitan #31

Closed oskay closed 8 years ago

oskay commented 9 years ago

Due to changes in the MacOS USB stack, the EggBot extensions do not reliably connect in 10.11.

More info at: https://github.com/evil-mad/wcb-ink/issues/42

oskay commented 8 years ago

Fix in codebase as of now. https://github.com/evil-mad/EggBot/commit/60d202a022af9b05f3b48f660e0cc71b3585771b

asadowsky commented 8 years ago

So, I tried this version of eggbot.py and I got this message:

Traceback (most recent call last):
  File "eggbot.py", line 1394, in <module>
    e.affect()
  File "/Applications/Inkscape.app/Contents/Resources/share/inkscape/extensions/inkex.py", line 268, in affect
    self.effect()
  File "eggbot.py", line 304, in effect
    self.EggbotOpenSerial()
  File "eggbot.py", line 1280, in EggbotOpenSerial
    self.serialPort = self.getSerialPort()
  File "eggbot.py", line 1345, in getSerialPort
    serialPort = self.testSerialPort( strComPort )
  File "eggbot.py", line 1307, in testSerialPort
    serialPort = serial.Serial( strComPort, timeout=1 ) # 1 second timeout!
  File "/Users/aj/.config/inkscape/extensions/serial/serialutil.py", line 282, in __init__
    self.open()
  File "/Users/aj/.config/inkscape/extensions/serial/serialposix.py", line 289, in open
    self.fd = os.open(self.portstr, os.O_RDWR|os.O_NOCTTY|os.O_NONBLOCK)
OSError: [Errno 2] No such file or directory: '/dev/cu.usbmodemfd13'
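
The OSError at the end of this traceback means that the device node handed to serial.Serial() did not exist at the moment the port was opened. A minimal sketch for checking that from Python, using only the standard library. The /dev/cu.usbmodem* pattern is an assumption based on the port name in the traceback, and list_usbmodem_ports and port_exists are hypothetical helpers, not part of eggbot.py:

```python
import glob
import os


def list_usbmodem_ports(pattern="/dev/cu.usbmodem*"):
    """Return the device nodes that currently match the pattern, sorted."""
    return sorted(glob.glob(pattern))


def port_exists(path):
    """True if the given device node is present right now."""
    return os.path.exists(path)


if __name__ == "__main__":
    print("Candidate ports:", list_usbmodem_ports())
    print("Port from traceback present:", port_exists("/dev/cu.usbmodemfd13"))
```

If the name in the error message is missing from the list of candidates, the extension is probably working from a stale or wrongly built port name rather than failing inside the serial driver.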

oskay commented 8 years ago

Did you replace the full set of files, or just eggbot.py?

(Edit: The updated file is for EggBot 2.6.0, which uses pyserial 2.7 -- You will likely need to do a manual install of 2.6.0 and then replace the eggbot.py file. We are waiting for verification that this version works before packaging it for proper release.)

asadowsky commented 8 years ago

I started clean. I wiped Inkscape from my drive, including everything in .config. Reinstalled. Confirmed that there were no Eggbot extensions showing up in the Extensions menu. I downloaded the master, tried just the extensions from there and got a serial error. Then I tried what you suggested, installing v. 2.6.0 and replacing just eggbot.py.

I got the same error as above.

Contents of my ~/.config/inkscape/extensions directory:

http://snag.gy/K3GVy.jpg

oskay commented 8 years ago

Thank you for trying that. I have not been able to reproduce this error here.

I will look into this further, and follow up here.

oskay commented 8 years ago

I thought that I had seen the OSError: [Errno 2] No such file or directory error once before-- and I have now been able to confirm and reproduce the circumstances under which I have seen it. (And so far as I can see, it does not have anything to do with El Capitan.)

I can get it to occur with a mismatched set of files in the extensions directory: With PySerial 2.7 and EggBot version 2.5 (which expects PySerial 2.6).
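
Because a mismatch between the extension files and the bundled PySerial is the suspected trigger, it may help to print which serial module actually gets imported and from where. A small sketch, assuming it is run with the same Python and search path that Inkscape uses; describe_serial_module is a hypothetical helper, and the version is read from the VERSION attribute that pyserial defines (with a fallback in case it is absent):

```python
import importlib


def describe_serial_module(module):
    """Summarise the version and file location of an imported module object."""
    version = getattr(module, "VERSION", "unknown")
    location = getattr(module, "__file__", "unknown")
    return "pyserial %s loaded from %s" % (version, location)


if __name__ == "__main__":
    try:
        serial = importlib.import_module("serial")
    except ImportError:
        print("No 'serial' module found on this Python path")
    else:
        print(describe_serial_module(serial))
```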

Please try this version of the EggBot extensions:
https://github.com/evil-mad/EggBot/releases/download/2.6.0/EggBot_extensions_v2.6.1.zip

Unzip and drag the contents of the "extensions" folder into your Inkscape extensions directory, replacing all files there. (You do not need to reinstall Inkscape.) Restart inkscape, and verify in the "*" tab of EggBot Control that you have version 2.6.1.

asadowsky commented 8 years ago

I think I did everything you suggested. The "*" tab banners 2.6.1 (http://snag.gy/wISbZ.jpg), but I'm still getting the same response when testing pen up/down:

Traceback (most recent call last):
  File "eggbot.py", line 1394, in <module>
    e.affect()
  File "/Applications/Inkscape.app/Contents/Resources/share/inkscape/extensions/inkex.py", line 268, in affect
    self.effect()
  File "eggbot.py", line 304, in effect
    self.EggbotOpenSerial()
  File "eggbot.py", line 1280, in EggbotOpenSerial
    self.serialPort = self.getSerialPort()
  File "eggbot.py", line 1345, in getSerialPort
    serialPort = self.testSerialPort( strComPort )
  File "eggbot.py", line 1307, in testSerialPort
    serialPort = serial.Serial( strComPort, timeout=1 ) # 1 second timeout!
  File "/Users/aj/.config/inkscape/extensions/serial/serialutil.py", line 282, in __init__
    self.open()
  File "/Users/aj/.config/inkscape/extensions/serial/serialposix.py", line 289, in open
    self.fd = os.open(self.portstr, os.O_RDWR|os.O_NOCTTY|os.O_NONBLOCK)
OSError: [Errno 2] No such file or directory: '/dev/cu.usbmodemfa14'

oskay commented 8 years ago

Thank you for trying that. Then there is something more sinister (not uniformly reproducible) at work here. :)

I have one more thing that you might be able to test before I go back to the drawing board.

Please download EggBot Extensions 2.5, from here: https://github.com/evil-mad/EggBot/releases/download/v2.5.0/EggBot_extensions_v2.5.zip

Unzip, and from that extensions folder, copy over ONLY the serial folder to your current Inkscape extensions directory, replacing the entire folder. (This older version of pyserial is compatible with the current (2.6.1) eggbot.py file, and works fine on my computer running El Capitan.)

You should be able to test it immediately; you should not even need to restart Inkscape if you're just replacing the serial folder.

asadowsky commented 8 years ago

That did it!! Many thanks!!

oskay commented 8 years ago

Awesome!

OTOH, I was hoping to modernize this a bit. Is there a chance you might be willing to help me test a few new versions (likely not today...) to see if we can narrow down what the cause was?

asadowsky commented 8 years ago

Okay, so now we're not throwing errors but there's still some odd behavior. It seems like some commands don't complete properly, even if the system will accept a subsequent command. For example, the "pen raise/lower" command only raises the pen. If I adjust the height upwards and try again, the servo will move to lift the pen further, but it never comes back down. If I adjust the upper limit lower and try again, nothing happens at all. If I go to the manual tab and "lower pen" it will go to the lower limit as set in the setup tab, which is good, and if I start a plot, it seems to work properly (though I haven't tried an actual plot yet), but it seems like there are still some commands getting lost somewhere. And raise/lower is a pretty important setup feature, as I know you know.

But at least this is moving in the right direction.

Of course I'd be delighted to help!!!

Many thanks!

oskay commented 8 years ago

Are you referring to the Toggle pen up/down option in the setup tab? Internally, this actually uses a single command, "TP" (toggle pen), which causes it to change states. (Inkscape does not keep track of the pen state; the EBB is supposed to.)

Thus, if a command is being dropped there, it sounds more like it's every other interaction or something, not every other command. Could you try and verify by repeatedly moving the egg motor in one direction?
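
To make the toggle behaviour concrete: the host only ever sends the same command, and the board flips whatever state it currently holds. A rough sketch, assuming a pyserial-style port object with a write() method and a carriage return as the command terminator. toggle_pen and FakePort are hypothetical names; FakePort only stands in for the EBB's state tracking, and its initial pen-up state is an arbitrary assumption:

```python
class FakePort(object):
    """Stand-in for the EBB: remembers pen state and flips it on each TP."""

    def __init__(self):
        self.pen_up = True   # assumed starting state, for illustration only
        self.received = []

    def write(self, data):
        self.received.append(data)
        if data.strip().upper() == b"TP":
            self.pen_up = not self.pen_up


def toggle_pen(port):
    """Send the toggle-pen command; the host never tracks the pen state."""
    port.write(b"TP\r")


if __name__ == "__main__":
    port = FakePort()
    toggle_pen(port)
    print("pen up after one toggle:", port.pen_up)
    toggle_pen(port)
    print("pen up after two toggles:", port.pen_up)
```

With a pure toggle, a command that is processed twice is indistinguishable from one that was never processed, which is worth keeping in mind when reading the symptoms reported here.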

asadowsky commented 8 years ago

I can move the egg motor repeatedly in the same direction without a problem.

Perhaps there's a firmware issue here? I recently tried flashing EBF_v230 of the firmware - shall I use something different?

oskay commented 8 years ago

So far as I know, 230 is well-behaved and shouldn't cause any problems.

This all sounds more like an OS-level interface problem to me-- commands and/or queries being dropped was the initial symptom of trouble with El Capitan. Most importantly, your computer seems to be behaving differently than mine here. Which exact MacOS X version are you running? Is it 10.11.1?

asadowsky commented 8 years ago

10.11.1, yes. On a MacBook Pro (15-inch, Early 2011).

For what it's worth, every manual command (less the engraver options, of course), appears to be working as expected. Even "raise pen, turn off motors" option in the setup tab works. The only command that seems troublesome right now is "toggle pen up/down", which seems to do nothing at all.

oskay commented 8 years ago

Hmmmm... That could possibly be related to the firmware version. (I can probably check that.) It could also be the case that something is inadvertently resetting (for lack of a better term) the EBB's idea of the pen position.

oskay commented 8 years ago

I have just confirmed that I have the same issue with pen toggle on 2.3.0.

oskay commented 8 years ago

My other EggBot here (firmware v 2.0.1) does not have that issue.

oskay commented 8 years ago

@EmbeddedMan Can you please check the functionality of the TP command in 2.3.0, outside of Inkscape?

If this is strictly a software (not firmware) issue, I would expect that the toggle pen command would work (or not work) in both firmware versions.

EmbeddedMan commented 8 years ago

Will do.

EmbeddedMan commented 8 years ago

I am running an EBB with v2.3.0 firmware. I plug it into the PC. The RC servo pulse is outputting at 1.0ms, as it is supposed to, at boot.

I then send the TP command.

The pulse lengthens to 1.33ms, moving my servo.

I then send the TP command again.

The pulse shortens to 1.00ms, moving my servo.

Each time I sent TP, I get a toggle of the servo position.

Now, this is without sending any other commands to the EBB. When Inkscape starts plotting, it will send commands that will set the raised and lowered positions of the servo. Maybe some of those commands are messing things up?

Is the above test sufficient? Can you duplicate these results? I think the EBB is doing what it is supposed to.

Let me know if I can do other tests to help narrow down the cause of this issue-

*Brian

oskay commented 8 years ago

Thanks Brian, That sounds like it should be a fine test.

The current issue (as I'm seeing it) is that (as tested from within the Inkscape extension, on MacOS 10.11), the TP command causes no response in 2.3.0, but works perfectly in 2.0.1.

Can you think of anything else that might have changed between these two versions that would mean that some arbitrary piece of "terminal software" could possibly notice a difference like this in the behavior of the two firmware versions? I'm sure that the TP statement is working properly on both-- as per your test. So, something else that we're doing (to query the board, say hello, etc) is causing that particular command to not work. :P

EmbeddedMan commented 8 years ago

So this is really interesting: I can't reproduce the failure. I'm on Windows, Inkscape 0.91, EggBot v 2.5.0, EBB v2.3.0, and the "Toggle pen up/down" in the Setup tab works for me - it moves the servo up and down.

I wonder if there isn't more going on here - like some interaction with the OS (Mac OS in both of your cases, right?)? Hmm.

I can't think of anything else in the EBB firmware versions that would cause just the TP command not to work, but that doesn't mean there isn't something there.

Can we test on other machines to see if they show the problem or not?

*Brian

oskay commented 8 years ago

Can you please test with EggBot 2.6.1 (pre-release version)?

EmbeddedMan commented 8 years ago

Using EggBot 2.6.1, I get identical results: in other words, it appears to work perfectly for me. My servo moves up and down when I click Apply with the Toggle pen up/down selected. I also confirmed that I'm using 2.6.1 by checking in the EggBot Control "*" tab.

*Brian

oskay commented 8 years ago

Great-- thanks. I'll see if I can confirm and follow up here. (Still weirded out that this depends upon the firmware version....)

oskay commented 8 years ago

@asadowsky I'd advise reverting to the earlier firmware version (e.g., 2.0.1) for the time being, to see if that clears up the issue for you. We'll continue to pursue a better solution here.

EmbeddedMan commented 8 years ago

I have set up a Linux VM with Ubuntu, installed the latest Inkscape and the EggBot 2.6.1 extensions, and I am still able to control my servo properly with the Toggle command in Inkscape.

On your systems, have you confirmed that the TP command in a terminal emulator works properly? Or doesn't work? I think that would help narrow down the issue on your machines.

oskay commented 8 years ago

In https://github.com/evil-mad/wcb-ink/issues/42, I described some pseudocode tests to try and debug the serial communication issues. If I repeat that same set of tests ("Test Procedure B") with firmware v. 2.3.0, I do not get the same dead silence ("[no output]"). Instead, in cases where the 2.0.1 firmware gives no output, it consistently produces the error message: !8 Err: Unknown command '?X:D358'

@EmbeddedMan, is this expected? And, do you know the cause of that particular error message?

Added: Looking at the UBW code, that error message comes from the line

    printf ( (far rom char *)"!8 Err: Unknown command '%c%c:%2X%2X'\r\n" ,cmd1 ,cmd2 ,cmd1 ,cmd2);

So, it's just repeating back what command it thinks it has been given.

EmbeddedMan commented 8 years ago

The value ?X:D358 tells you what the EBB 'heard' from the PC. It takes in two characters (almost all commands are two characters long), and then tries to match those two characters with its list of known commands. When you get an "Unknown command" error, it means that the two characters it heard didn't match. It 'heard' a 0xD3 and then a 0x58 (which is a 'X' character). So, for some reason, the data coming from the PC starts out with 0xD3 then 0x58, and this isn't a known command.

I'm not sure how those two bytes are getting sent to the EBB, but that's what the EBB is hearing. It should send a series of error messages like this for every pair of characters that don't match known commands. Sending a CR always causes the EBB to 'start over' with its processing of commands, so that's a good way to clear the palette.
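
The decoding described above can be automated when collecting many of these messages. A minimal sketch, assuming the message layout quoted earlier in the thread (echoed characters, a colon, then the same bytes as two-character hex fields); decode_unknown_command is a hypothetical helper, not part of the EggBot code:

```python
import re


def decode_unknown_command(message):
    """Return the byte values echoed in an EBB 'Unknown command' error.

    Returns None if the message does not look like that error.
    """
    match = re.search(r"Unknown command '(.*):([0-9A-Fa-f ]+)'", message)
    if match is None:
        return None
    hex_part = match.group(2)
    # Each byte is printed as a two-character field, possibly space-padded.
    fields = [hex_part[i:i + 2] for i in range(0, len(hex_part), 2)]
    return [int(field.strip() or "0", 16) for field in fields]


if __name__ == "__main__":
    values = decode_unknown_command("!8 Err: Unknown command '?X:D358'")
    print([hex(v) for v in values])
```

Reading the hex fields rather than the echoed characters avoids guessing at bytes such as 0xD3, which show up as '?' when they are not printable ASCII.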

Does that help at all?

*Brian

oskay commented 8 years ago

I'm honestly much happier to have a response than none here, but do you know a reason why the response (to the same test program) would be different in the two firmware versions?

EmbeddedMan commented 8 years ago

Yeah, that's a great question. Let me check.

*Brian

EmbeddedMan commented 8 years ago

Here's what I can tell. I can only easily go back to version 2.1.1, since that's the point where we cut over from SVN to git. I can compare 2.0.1 with 2.1.1, but that's going to take more time than I have right now.

So - comparing 2.1.1 with 2.3.0, I don't see any changes to the command processor that would change the way that commands are processed or the way that errors are reported. There are of course a handful of new commands.

Now, there are lots, and lots of changes between these two versions, so I can't be sure that none of these other changes might cause a difference in behavior. We changed versions of USB stack among other things, which could have changed something, although this seems relatively remote to me.

I want to be sure that I understand the problem in as much detail as possible:

Under Windows, Linux, and all versions of Mac OS before El Capitan, there is no issue - both v2.0.1 and v2.3.0 EBB firmwares work properly with no errors.

Under Mac OS 10.11, only when using v2.3.0 EBB firmware, the TP command when used from within Inkscape has no effect, and the "Test Procedure B" gives Unknown Command error, but v2.0.1 EBB does not.

Is all of that correct?

I still really think that a simple terminal emulator test on Mac OS 10.11 would be a really good thing to do right now. Does the TP command work from the terminal emulator? Can you simulate Test Procedure B from the terminal emulator and see the Unknown Command error?

*Brian

oskay commented 8 years ago

Under Windows, Linux, and all versions of Mac OS before El Capitan, there is no issue - both v2.0.1 and v2.3.0 EBB firmwares work properly with no errors.

So far as I can see, yes.

Under Mac OS 10.11, only when using v2.3.0 EBB firmware, the TP command when used from within Inkscape has no effect, and the "Test Procedure B" gives Unknown Command error, but v2.0.1 EBB does not.

Yup.

I still really think that a simple terminal emulator test on Mac OS 10.11 would be a really good thing to do right now. Does the TP command work from the terminal emulator? Can you simulate Test Procedure B from the terminal emulator and see the Unknown Command error?

Working on that now.

Opening a serial connection from the terminal on the Mac to a device located at /dev/cu.usbmodem1411 can be done with the command screen /dev/cu.usbmodem1411. (Note-to-self: Use control-a followed by 'ky' to close the session and exit screen.) I'll follow up with test results shortly.

oskay commented 8 years ago

Test 1: MacOS 10.9.5, Firmware 2.0.1

  • Plug in USB
  • Reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • command tp: Responds correctly.
  • command tp: Responds correctly.
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • Close terminal

Test 2: MacOS 10.9.5, Firmware 2.0.1

  • Leave USB connected; do not reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • command tp: Responds correctly.
  • command tp: Responds correctly.
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • Close terminal

Test 3: MacOS 10.9.5, Firmware 2.3.0

  • Plug in USB
  • Reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • command tp: OK (and responds correctly).
  • command tp: OK (and responds correctly).
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • Close terminal

Test 4: MacOS 10.9.5, Firmware 2.3.0

  • Leave USB connected; do not reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • command tp: OK (and responds correctly).
  • command tp: OK (and responds correctly).
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • Close terminal

Test 5: MacOS 10.11, Firmware 2.0.1

  • Plug in USB
  • Reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • command tp: Responds correctly.
  • command tp: Responds correctly.
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • Close terminal

Test 6: MacOS 10.11, Firmware 2.0.1

  • Leave USB connected; do not reset EBB
  • Open terminal (using screen)
  • command v: [no response]
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • command tp: Responds correctly.
  • command tp: Responds correctly.
  • command v: EBBv13_and_above EB Firmware Version 2.0.1
  • Close terminal

Analysis: This matches my prior experience, as described in https://github.com/evil-mad/wcb-ink/issues/42. The first time that we connect, everything is fine. The second time, the first command is ignored, and our workaround was to perform an additional query before trying to do anything "important."

Test 7: MacOS 10.11, Firmware 2.3.0

  • Plug in USB
  • Reset EBB
  • Open terminal (using screen)
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • command tp: OK (and responds correctly).
  • command tp: OK (and responds correctly).
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • Close terminal

Test 8: MacOS 10.11, Firmware 2.3.0

  • Leave USB connected; do not reset EBB
  • Open terminal (using screen)
  • response returned, unsolicited: !8 Err: Unknown command 'N:4E'
  • command v: !8 Err: Unknown command '[X:5B58'
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • command tp: OK (and responds correctly).
  • command tp: OK (and responds correctly).
  • command v: EBBv13_and_above EB Firmware Version 2.3.0
  • Close terminal

Test 8: (repeated) Exactly reproducible.

Analysis: (1): The most significant difference between 2.0.1 and 2.3.0 is that 2.3.0 returns an "OK" for each valid command (rather than nothing) that is not a query, and an "Unknown command" message (rather than nothing) for each invalid command. (2): There appears to be some additional "gunk in the buffer" when re-opening a serial connection in MacOS 10.11. It might be possible to work around this with an additional buffer-clearing step, for example.

EmbeddedMan commented 8 years ago

This is a fantastic set of tests.

But I don't see examples of you sending invalid commands - I would think that both 2.0.1 and 2.3.0 should respond in the same way to an unknown command. I can't test this right now, but is that not what you found?

*Brian
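
The workaround mentioned in the analysis of these tests (an extra query before doing anything "important", possibly combined with a buffer-clearing step) could be sketched as follows. This is not the code from the EggBot extensions: sync_with_board and query_version are hypothetical helpers, the port is assumed to be a pyserial-style object with write() and readline() that was opened with a read timeout, and a good version reply is assumed to begin with "EBB" as in the transcripts in this thread.

```python
def query_version(port):
    """Send the 'v' query and return the reply line as text."""
    port.write(b"v\r")
    return port.readline().decode("ascii", "replace").strip()


def sync_with_board(port, attempts=3):
    """Discard stale input, then repeat the version query until it looks sane.

    Returns the version string, or None if no attempt produced one.
    """
    # A bare CR makes the EBB restart its command parsing.
    port.write(b"\r")
    # Keep reading until a read times out empty, dropping leftover lines.
    while port.readline():
        pass
    for _ in range(attempts):
        reply = query_version(port)
        if reply.startswith("EBB"):
            return reply
    return None
```

Retrying the query a small, bounded number of times covers both observed cases: the silent first query on 2.0.1 and the "Unknown command" reply on 2.3.0.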

oskay commented 8 years ago

Was there a reason to be testing unknown commands?

EmbeddedMan commented 8 years ago

Only that you mentioned it : "The most significant difference between 2.0.1 and 2.3.0 is that 2.3.0 returns an "OK" for each valid command (rather than nothing) that is not a query, and an "Unknown command" message (rather than nothing) for each invalid command."

Since you reported it as the most significant difference between the firmware versions, but didn't explicitly show examples of how you tested that difference.

That's all.

*Brian

oskay commented 8 years ago

We can see the difference as follows:

  • command v: [no response] (v 2.0.1)
  • command v: !8 Err: Unknown command '[X:5B58' (v 2.3.0)

It's clear that each is responding to what it perceives to be a bad command (from MacOS 10.11), but that they respond differently.

EmbeddedMan commented 8 years ago

Ahh, OK. I see. This is interesting, as you are sending a valid command "v" in both cases. In the first case, the EBB seems to receive nothing, but in the second case you get the characters "[X" sent to the EBB.

The 2.0.1 EBB should respond with "!8 Err: Unknown command" if you send it an unknown command - that's why I suspect that no bytes got to it when you sent "v".

I'm really glad to see that in all possible cases, the "TP" command performed properly.

So, what do you think should be the next steps to resolving the original issue?

*Brian

oskay commented 8 years ago

I have now tested via terminal that both versions do indeed return !8 Err: Unknown command for an unknown command. So, I may need to re-check some of this.

EmbeddedMan commented 8 years ago

I wonder if there's some way we can passively log what's going on between the Mac OS 10.11 system and the EBB, with Inkscape running. In other words, not "here's what I sent to the EBB" but rather "here's what the EBB thinks it received". I'd like to know why sending "v" doesn't produce a response (on 2.0.1) and sends "[X" on 2.3.0. Is the Mac actually sending something different? Or is there a difference in the firmware on the EBB interpreting something that the Mac is doing differently?

A USB protocol analyzer would be able to show this difference. I have such a thing, but have never had to use it (yet).

Maybe I need to buy a new Mac system so I can test this out . . .

*Brian

oskay commented 8 years ago

A protocol tester would be quite helpful here. Maybe time to get one.

One more discovery: Working from an independent (Inkscape-free) python test script, I found that I could reproduce the symptom of the TP command (apparently) not working, in 2.3.0 only.

Python Test Script C does the following:

This works correctly with firmware 2.0.1. (It toggles the pen...) It does not work under 2.3.0.

It occurred to me that perhaps the issue was that we were not giving the pen time to change states before closing the port. So I tried adding a delay after the pen toggle:

Python Test Script D

This behaves just like you would expect in 2.0.1: It toggles the pen, waits 3 seconds, and ends the program. However, under 2.3.0, it does something surprising: After the three seconds, the pen goes back to the original position, regardless of whether it was initially up or down. That is to say, it undoes the toggle operation.

Analysis: This also implies that before the delay was added, the EBB received and understood the TP command, but then immediately processed the TP command a second time-- returning it to the initial state. This makes it appear to have not processed the command at all.

Mystery: Why should this depend on the firmware version?

oskay commented 8 years ago

@asadowsky I've worked from this new set of data to form a new version that appears to work on MacOS 10.11 El Capitan, with all of its quirks, and with EBB Firmware version 2.3.0.

It's here: https://github.com/evil-mad/EggBot/releases/tag/2.6.2

Please give that a try. I've put two zipped versions there, one each with pyserial 2.5 and 2.7 (both work for me).

EmbeddedMan commented 8 years ago

Windell,

These tests are fantastic. I'm certain that the clues to exactly what's going on are here.

The one thing that comes to mind is that terminal programs don't normally send multiple bytes together in one USB packet. They send one character at at time (as you type them). But Python (or Inkscape) will send a whole 'command' (like "TP") as a single USB packet. So the way that the EBB handles this is slightly different. I wonder if that's part of why we're seeing a difference.

Also, 10.11 may be doing some perfectly legal, but not very common things when it first opens up the USB port, and version 2.3.0 may respond differently than 2.0.1 since they use different USB stack versions.

What you found about the TP command seemingly being sent a second time when the port is very curious. I really wish I had a 10.11 Mac to throw my analyzer on and see exactly what's going on at a low level.

But it sounds like you may have been able to work around these problems with your latest code - I'll wait to do more debugging until we know if your workaround solves things or not.

*Brian

On Wed, Oct 28, 2015 at 4:17 PM, Windell Oskay notifications@github.com wrote:

A protocol tester would be quite helpful here. Maybe time to get one.

One more discovery: Working from an independent (Inkscape-free) python test script, I found that I could reproduce the symptom of the TP command (apparently) not working, in 2.3.0 only.

Python Test Script C does the following:

  • Open serial port
  • Query and Read version number
  • Query and Read version number
  • Check if that version number read (the second time) is good.
    • If not, close port and terminate program.
    • If so, continue.
  • Toggle pen position
  • Close serial port

This works correctly with firmware 2.0.1. (It toggles the pen...) It does not work under 2.3.0.

It occurred to me that perhaps the issue was that we were not giving the pen time to change states before closing the port. So I tried adding a delay after the pen toggle:

Python Test Script D

  • Open serial port
  • Query and Read version number
  • Query and Read version number
  • Check if that version number read (the second time) is good.
    • If not, close port and terminate program.
    • If so, continue.
  • Toggle pen position
  • Wait 3 seconds
  • Close serial port
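For anyone who wants to repeat these tests, the steps of Scripts C and D could be sketched roughly as follows. This is not the original test script. It assumes Python 3 with bytes commands, that `port` is an already-open pyserial-style object, that `v` is the version query and `TP` the pen toggle (each terminated by a carriage return), and that a good version reply starts with `EBB`. The name `run_toggle_test` is made up for the example.

```python
import time


def run_toggle_test(port, delay_s=0.0):
    """Script C when delay_s is 0, Script D when delay_s is 3.

    Returns True if the pen toggle was sent, False if the version
    check failed. The port is closed in either case.
    """
    try:
        version = b""
        for _ in range(2):
            # Query twice; only the second reply is checked.
            port.write(b"v\r")
            version = port.readline()
        if not version.startswith(b"EBB"):
            return False
        port.write(b"TP\r")   # toggle pen position
        time.sleep(delay_s)   # Script D waits here before closing
        return True
    finally:
        port.close()
```

With pyserial installed, something like `run_toggle_test(serial.Serial("/dev/cu.usbmodemfd13", timeout=1), delay_s=3)` would correspond to Script D; the device name will differ from machine to machine.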

This behaves just like you would expect in 2.0.1: It toggles the pen, waits 3 seconds, and ends the program. However, under 2.3.0, it does something surprising: After the three seconds, the pen goes back to the original position, regardless of whether it was initially up or down. That is to say, it undoes the toggle operation.

Analysis: This also implies that before the delay was added, the EBB received and understood the TP command, but then immediately processed the TP command a second time-- returning it to the initial state. This makes it appear to have not processed the command at all.

-

This behavior does not occur when testing directly from the terminal. It seems that something about the way that pyserial works is causing this problem. But again, only on firmware v 2.3.0.

Adding a different "last" command like an additional version query allows the TP command to work again, since it does not undo the toggle.
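A minimal sketch of that workaround, under the same kind of assumptions as the test scripts: `port` is an open pyserial-style object, the EBB answers `TP` with a single reply line, and `v` is a harmless query. The function name is hypothetical.

```python
def toggle_pen_then_query(port):
    """Toggle the pen, then make a version query the last command sent.

    If the final command gets processed a second time when the port
    closes, repeating a version query has no visible effect, whereas
    repeating TP would undo the toggle.
    """
    port.write(b"TP\r")
    port.readline()        # assumed: one reply line to TP
    port.write(b"v\r")     # harmless "last" command
    return port.readline()
```

This only hides the symptom; it does not explain why the last command appears to be handled twice.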


oskay commented 8 years ago

While the new version (2.6.2) was working for a while, it apparently is not now. This may have to do with changes in the USB stack between 10.11 and 10.11.1.

We have now had multiple reports of serial timeout errors, which may manifest as an error of the form "Plot paused by button press [...] EBB Serial Timeout."

I have been able to reproduce the error here and it appears that the issue is due to the shorter serial timeout value in 2.6.2. Version 2.6.3 will go back to the 1 s timeout value, and it appears (in testing here) to work quite well.
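For context, pyserial takes the read timeout (in seconds) when the port is opened, for example `serial.Serial(port_name, timeout=1)`, and `readline()` returns whatever has arrived once that time runs out, possibly nothing. A rough sketch of how a too-short timeout can surface as an error is below; `EBBTimeout` and `query` are hypothetical names, not the extension's own code, and the assumption that replies end in a newline is mine.

```python
class EBBTimeout(Exception):
    """Raised when no complete reply line arrives before the read timeout."""


def query(port, command):
    """Send a command and return its reply line, or raise EBBTimeout."""
    port.write(command)
    reply = port.readline()
    # An empty or partial line means the timeout expired first.
    if not reply.endswith(b"\n"):
        raise EBBTimeout("no complete reply to %r (got %r)" % (command, reply))
    return reply.strip()
```

If the round trip through the USB stack sometimes takes longer than the configured timeout, a helper like this raises even though the EBB did answer, which would be consistent with the reports above.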

EmbeddedMan commented 8 years ago

Wow. I guess I'm somewhat surprised that a change in the USB stack would affect how long commands take to leave the PC, turn around in the EBB, and return to the PC. And by affect I mean 'take longer'. You'd think that changes to the stack would make things faster, not slower.

I am glad that a fix for this new issue is so simple.

*Brian

On Thu, Nov 12, 2015 at 12:55 PM, Windell Oskay notifications@github.com wrote:

While the new version (2.6.2) was working for a while, it apparently is not now. This may have to do with changes in the USB stack between 10.11 and 10.11.1.

We have now had multiple reports of serial timeout errors, which may manifest as an error of the form "Plot paused by button press [...] EBB Serial Timeout."

I have been able to reproduce the error here and it appears that the issue is due to the shorter serial timeout value in 2.6.2. Version 2.6.3 will go back to the 1 s timeout value, and it appears (in testing here) to work quite well.


oskay commented 8 years ago

Update: Mac installer EggBot2.6.3.r1s.mpkg is released. We have multiple confirmations that this version is working well under El Capitan now.

oskay commented 8 years ago

Well, that didn't last long. (Nothing like marking an issue "solved" to get people to help with testing.)

We have one more report now of essentially the same issue that @asadowsky initially reported. (@asadowsky: If you could follow up and see if the current version is working for you, that would be helpful.)

I'm going to try the current extension version, but rolling back to pyserial 2.5, to see if that helps. (It has seemed more reliable....)

oskay commented 8 years ago

Update: The version with pyserial 2.7 (EggBot 2.6.3) was apparently causing failures on some set of Macs-- we have not been able to reproduce it locally. It was causing errors such as "OSError: [Errno 2] No such file or directory: '/dev/cu.usbmodemfd14'".

Reverting to pyserial 2.5 (in EggBot 2.6.4) has fixed the problem on most computers that were experiencing that issue. So far, we do not have any hint of what is causing that error. We could really use help from someone who still sees it in 2.6.3 to dig into the cause.
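If someone with an affected Mac wants to dig in, one thing worth checking is whether the device node named in the error actually exists at the moment the port is opened. The sketch below is a diagnostic idea only, not how the extension finds ports: the helper names, the `/dev/cu.usbmodem*` pattern, and the `opener` callback (for instance `lambda p: serial.Serial(p, timeout=1)`) are all assumptions.

```python
import errno
import glob


def find_candidate_ports(pattern="/dev/cu.usbmodem*"):
    """List device nodes that currently match the pattern."""
    return sorted(glob.glob(pattern))


def open_first_available(pattern, opener):
    """Try each candidate; skip nodes that vanish before they can be opened."""
    for path in find_candidate_ports(pattern):
        try:
            return opener(path)
        except OSError as err:
            if err.errno != errno.ENOENT:
                raise  # only "No such file or directory" is skipped
            print("Listed but could not be opened:", path)
    return None
```

A node that is listed and then fails with Errno 2 would suggest the device is re-enumerating between the scan and the open.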

A separate issue has come up as well, for which we have two reports of failures under 2.6.4. In these cases, the extension apparently freezes Inkscape: http://forum.evilmadscientist.com/discussion/526/ostrich-eggbot-inscape-crash

In one of these cases, the freeze was accompanied by a message of the form "SPUSBDevice: IOServiceGetMatchingService did not return anything for location 0x0616000. Unable to find an EggBot on any serial port."