LaserWeb / deprecated-LaserWeb3

Open Source Laser Cutter / Engraver software. Supports gcode, svg, dxf, stl, png, jpg, bmp

Weird jerkiness issue with large gcode files #219

Closed hchahine closed 7 years ago

hchahine commented 7 years ago

I imported an image, converted it to gcode and then ran the job. I noticed that the laser scanning feed speed of the raster was inconsistent. As in, the speed was changing a lot for a single X line scan. The gcode is only changing x/y position and spindle during these moves, so I don't expect a speed change. Only acceleration and deceleration at the ends - the feedrate is fixed at 1200.

If I manually cut back the number of lines in the gcode file to somewhere around 100k lines, the issue goes away. As in, I open the gcode in an editor and just delete maybe 60% of the lines.

Not sure what is happening, but I suspect it's on the Laserweb side, as I don't see it when using a generic gcode sender to send the same file. It could be a raspberry pi limitation also - I'm not sure what's going on under the hood.

Note: The pi is headless so I'm using another PC browser to access pi via LAN wifi.

Laser mode is on in Grbl ($32=1) and I am using M4 laser control.

- Laserweb: running on a raspberry pi 2
- Grbl v1.1e: running on an arduino nano
- Machine: el-cheapo chinese 2.5W laser

Can someone see if they can repeat this and/or shed some light? penguin.gcode.txt

cheton commented 7 years ago

One relevant issue here: https://github.com/cheton/cnc/issues/108#issuecomment-270329141

Push and pop are not constant time operations; they take time proportional to the array's length (O(n)) and cause memory fragmentation.

You will get an obvious performance drop on RPi and RPi2 when dealing with a large array. You can use an index variable instead of calling the array's push(), shift(), and unshift() methods; this will be much faster for a large array.

https://github.com/LaserWeb/LaserWeb3/blob/master/server.js#L698

// Queue
function addQ(gcode) {
    gcodeQueue.push(gcode);
}

function jumpQ(gcode) {
    gcodeQueue.unshift(gcode);
}

https://github.com/LaserWeb/LaserWeb3/blob/master/server.js#L725

code = gcodeQueue.shift().replace(/\s+/g, '');
// ...
gcodeQueue.unshift(gcode);

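cheton's point can be sketched in isolation. This is an editor's illustration with made-up function names, not code from server.js: both functions drain a queue in the same order, but shift() moves every remaining element down one slot on each call (O(n) per line, roughly O(n²) for a whole file), while an index cursor leaves the array untouched.

```javascript
// Illustrative sketch (hypothetical names, not LaserWeb code).

// O(n) per item: each shift() moves all remaining elements down one slot.
function drainWithShift(queue) {
    var sent = [];
    while (queue.length > 0) {
        sent.push(queue.shift());
    }
    return sent;
}

// O(1) per item: the cursor just advances; no elements are moved.
function drainWithCursor(queue) {
    var sent = [];
    var cursor = 0;
    while (cursor < queue.length) {
        sent.push(queue[cursor]);
        cursor++;
    }
    return sent;
}
```

Both return the same result; the difference only shows up as wall-clock time on large arrays, which is why a ~100k-line gcode file exposes it on a Pi while short files do not.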
ghost commented 7 years ago

@cprezzi

hchahine commented 7 years ago

Yep, looks like the same issue

cprezzi commented 7 years ago

@cheton: Do you have an elegant replacement for push, shift and unshift? They are just so convenient ;)

cprezzi commented 7 years ago

@hchahine : Does stuttering go away if you slow down your feed? What's the maximum feed you reach without stuttering?

cheton commented 7 years ago

@cprezzi: I managed to use an index variable this.state.sent to store the current position, and no longer use the shift() and unshift() methods to shift elements off the beginning of a large array, because shift and unshift are O(n) operations that take time proportional to the array length:

https://github.com/cheton/cnc/blob/master/src/app/lib/sender.js#L135

while (this.state.sent < this.state.total) {
    // Remove leading and trailing whitespace from both ends of a string
    sp.line = sp.line || stripComments(this.state.lines[this.state.sent]).trim();

    // The newline character (\n) also consumes RX buffer space
    if ((sp.line.length > 0) && ((sp.dataLength + sp.line.length + 1) >= sp.bufferSize)) {
        break;
    }

    this.state.sent++;
        // ...
}

You can use a similar mechanism in your code, which might look something like this:

while ((gcodeSent < gcodeQueue.length) && !blocked && !paused) {
    gcode = (gcodeQueue[gcodeSent] || '').replace(/\s+/g, '');
    spaceLeft = grblBufferSpace();
    gcodeLen = gcode.length;
    if ((gcodeLen + 1) <= spaceLeft) {
        gcodeSent++;
        grblBufferSize.push(gcodeLen + 1);
        port.write(gcode + '\n');
        lastSent = gcode;
        writeLog('Sent: ' + gcode + ' Q: ' + gcodeQueue.length + ' Bspace: ' + (spaceLeft - gcodeLen - 1));
    } else {
        blocked = true;
    }
}
cheton commented 7 years ago

The grblBufferSize array is usually small, so you don't need to care about it.
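For readers unfamiliar with why grblBufferSize exists at all: Grbl is streamed with character-counting flow control, where the sender tracks how many bytes are in flight and never exceeds Grbl's serial RX buffer (128 bytes by default in Grbl 1.1). Here is a minimal sketch of that bookkeeping, with assumed names (the real server.js code differs):

```javascript
// Sketch of character-counting flow control for Grbl (assumed names).
var GRBL_RX_BUFFER_SIZE = 128; // Grbl 1.1 default serial receive buffer

var grblBufferSize = []; // byte lengths (incl. '\n') of lines Grbl hasn't acked yet

function grblBufferSpace() {
    // Free space = total buffer minus every in-flight line
    return GRBL_RX_BUFFER_SIZE - grblBufferSize.reduce(function (sum, len) {
        return sum + len;
    }, 0);
}

function onLineSent(line) {
    grblBufferSize.push(line.length + 1); // +1 for the '\n' terminator
}

function onOkReceived() {
    grblBufferSize.shift(); // Grbl parsed one line, so its bytes are free again
}
```

Because this array only ever holds the few lines that fit in 128 bytes, calling shift() on it is cheap.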

cprezzi commented 7 years ago

Thanks. I'll try it and will do some measurements.

cheton commented 7 years ago

Note that it is not easy to track down this performance issue on a powerful desktop or laptop, but it can be easily reproduced on RPi2. You can try sending his g-code file on RPi2 and make sure you can reproduce the reported problem.

cprezzi commented 7 years ago

Sure, I will test with Raspi2 (and my old notebook, which already has a hanging frontend when running larger jobs).

hchahine commented 7 years ago

@cprezzi Ok, I've had a chance to try some things out.

Varying the feedrate without changing the number of gcode lines did have some effect on the stuttering. Higher feedrate = more stuttering, lower feedrate = less stuttering. When it got down to around F200, it appeared to go away BUT I think the problem is just kinda masked by the slow feedrate. I think it is still kinda there, as it is mostly a function of the number of gcode lines.

Also, after loading a large gcode file, waiting for it to load completely and then clicking on "run gcode", there is a pause of a few seconds between when the button is clicked and when the laser starts to move. When running a gcode file where the no. of gcode lines is low enough, there is no pause whatsoever. This is how I can tell whether a new job is prone to stutter or not.

cprezzi commented 7 years ago

@hchahine The slower feed was just a test. You could expect to reach about 15-20mm/s for raster pictures and >100mm/s for vector data with longer lines on an Arduino Uno/Nano with Grbl 1.1e.

At the time you click "run gcode", the whole gcode file is transferred to the server (via websocket) before the server starts running the job. That causes the delay.

hchahine commented 7 years ago

I would expect the file transfer delay to be proportional to the size of the file. It doesn't seem to be. It's more of a boolean-type behaviour of being either really quick (e.g. sub 1 sec) or really long (e.g. many seconds) based on some threshold for file size/no. of lines.

As I said earlier, the bottleneck is not the Arduino, as the above file does fine using other gcode sender software. The issue is more related to the quantity of instructions within a file than to the instruction execution rate. At some point, when reducing the number of total instructions, the issue goes away completely while no change is made to the feedrate (i.e. the instruction execution rate).

cprezzi commented 7 years ago

There is a big difference between a gcode-sender running a local file and our server running a file from an array in memory (especially on slow hardware with little memory). The gcode-sender just reads line by line from the file and doesn't need much memory.

The problem you see is a performance problem caused by the slow speed and little memory of Raspi 2. Did you also try with Raspi 3?

hchahine commented 7 years ago

Haven't got a pi3 on hand to test - can get hands on one though if it might help.

The PI isn't doing much else other than serving Laserweb3, so CPU doesn't really spike above 15%. It's not even driving Laserweb3 - this is happening on another PC. As for memory usage, I haven't checked.

So, we send the whole gcode file to the server in one go (on the run gcode command) rather than stream to a FIFO located on the server? This file is written to an array, right? Why don't you instead just write to a temp local server-side file and then parse this file locally? Then it'll be like any other gcode sender.

I think @cheton's suggestion should work. I'm happy to try it out. Do I just mod the JS file, or do I need to recompile anything to test this?

cprezzi commented 7 years ago

Temporarily saving the file locally on the server would be an option, but it complicates handling and would slow down the process on faster hardware. I will rather try to optimize the code but stay with the array.

cprezzi commented 7 years ago

If you'd like to test it yourself, you could just change the server.js file and restart the server with "node server.js".
I will test some optimizations (including cheton's suggestion) and update the master branch if it works.

tbfleming commented 7 years ago

> Push and pop are not constant time operations, but are taking a time proportional to the array's length ( O(n) ), and memory fragmentation.

push has an amortized constant time. pop is almost free; it just has to adjust the array's length field. V8 uses the same approach as C++'s vector; see http://cs.stackexchange.com/questions/9380/why-is-push-back-in-c-vectors-constant-amortized for the hairy math.

shift and slice are a different story; they're convenient, but they have a penalty.

ghost commented 7 years ago

FWIW, we are seeing this same problem with LW3 on PC controlling a smoothie. The larger the raster image, the slower the maximum smooth feed-rate becomes.

An image 50x30mm and 0.2mm pixel size will smoothly raster at 60mm/sec. The same image at 500x300mm and all other settings the same will raster smoothly at 20mm/sec.

tbfleming commented 7 years ago

@DarklyLabs what do you get with other senders running on Windows?

ghost commented 7 years ago

@tbfleming Just ran some tests. Problem doesn't occur with Pronterface sender under windows. Tested the same file on both LW3 and Pronterface on the same computer. Pronterface streaming is smooth, LW3 streaming is rough.

cprezzi commented 7 years ago

@DarklyLabs: The main difference between pronterface and lw is that lw is a client/server solution and pronterface is a local application. Pronterface just reads the file line by line; LaserWeb reads the whole file on the client (browser) and sends it to the server, which holds it in ram to stream it to the machine. The longer the file is, the more ram is used and the slower it gets. I have some ideas to overcome that problem, but that would need a lot of changes and therefore good testing.

ghost commented 7 years ago

@cprezzi I am not understanding why the feed rate achievable is affected by the size of the file. If it is all stored and contained in ram then how would the size affect playback speed? The controller is just expecting a constant stream of data which should be achievable if it is being fed from ram.

ghost commented 7 years ago

@cprezzi @openhardwarecoza We have been working a little with the Smoothie team to try and get to the bottom of our jerky-raster-movement problems. You can see the thread here: https://github.com/Smoothieware/Smoothieware/issues/1096#issuecomment-274901021

We have finally managed to get some smooth rastering results on a Windows machine using a python script Jim created. Streaming via this script we were able to smoothly stream a large gcode file at around 80-100mm/sec.

Here is the file we were testing with:

test_416x288_100mmps0p2_horiz.gcode.zip

We tried the same file on LW3 within windows and OSX but could not achieve the same smooth results.

Perhaps this method could shed some light or reveal some ideas on how LW can achieve the same result?

Here are the python file details: https://github.com/Smoothieware/Smoothieware/blob/edge/fast-stream.py

Usage is: python fast-stream.py time-test.g /dev/ttyACM0 -q (time-test.g would be the name of the gcode file you want to send).

cprezzi commented 7 years ago

@DarklyLabs Just a hint about your testfile: Using 80dpi with a 0.2mm laser diameter makes no sense! 80dpi means a pixel size of 0.3175mm. By using it with a 0.2mm laser diameter, which means a 0.2mm pixel size in gcode, you just blow up the gcode size for no reason. The minimum dpi for a 0.2mm laser diameter is 127 (25.4 / 0.2); with lower dpi settings you lose sharpness and blow up the gcode size.
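The dpi arithmetic above is easy to check. A small sketch (editor's illustration, helper names invented) of the two conversions being used, where 25.4 is millimetres per inch:

```javascript
// dpi <-> pixel-size conversions (illustrative helper names).
var MM_PER_INCH = 25.4;

// Pixel size in mm produced by a given raster dpi
function pixelSizeMm(dpi) {
    return MM_PER_INCH / dpi;
}

// Minimum dpi so that one pixel is no larger than the laser spot
function minDpiForSpot(spotSizeMm) {
    return MM_PER_INCH / spotSizeMm;
}
```

So pixelSizeMm(80) gives the 0.3175mm quoted above, and minDpiForSpot(0.2) gives 127.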

ghost commented 7 years ago

@cprezzi We are using the DPI setting to achieve the required engraving size as there appears to be no other method currently in LW3 nor raster-to-gcode. Please understand these files we are using are for testing movement smoothness and feed-rate. They are not so much about the quality of the final engraving.

Also, if our laser spot size is 0.2mm, using your calculated pixel size of 0.3175mm will result in gaps between the individual engraving lines.

There will always be a discrepancy between the 'perfect' resolution and the settings used for engraving. Customers will have pictures they want to engrave at various sizes and may not have the expertise to perform the required manipulations in a graphics package.

cprezzi commented 7 years ago

@DarklyLabs: You are right, customers probably don't understand why dpi settings change the picture size of their raster image. So either we have to document that better, or just take the laser diameter as the pixel size and give the user the ability to change the size by a % value (instead of dpi).

cprezzi commented 7 years ago

By the way: I did a lot of speed tests yesterday. It seems I am not able to reach anything more than about 250-280 lines/s with nodejs and node-serialport (on Windows 10).

I am investigating now if there is any faster way to do serial communication in node js. I hope you understand that it's not a realistic option to use anything other than node js (over the short term).

ghost commented 7 years ago

These limitations are unfortunate but we understand you are working on finding a solution. Are there any options to 'call' a more efficient streaming program, such as Jim's python code, through node.js?

cprezzi commented 7 years ago

One option would be to implement the whole server in python, which I will not do as I'm not a python developer. Somebody else could probably do...

arthurwolf commented 7 years ago

You'd need to set up a simple http or websockets server on top of jim's serial script. Writing the basic program isn't that difficult; there are plenty of examples to start from, but getting it to actually be as efficient (and stable) as possible will likely require quite a bit of skill.


jorgerobles commented 7 years ago

I don't think python has much better performance than node.js; a faster alternative would be a C++ nodejs module (which I don't know how to do either).


arthurwolf commented 7 years ago

Well, the point is that we know the python script jim did works much better. Maybe it's possible to get nodejs to work as well, or maybe nodejs's serial library has some limitation; at this point we don't know.


ghost commented 7 years ago

Not to mention porting all the other code we already have into it (grbl+tinyg buffers, jog/lasertest/overrides controls, and the rest of the 90% of server.js).


cprezzi commented 7 years ago

@arthurwolf It's not at all as easy as you write! And please stop calling it jim's script, he just copied it from Sonny Jeon. As Peter wrote, we have so much logic in the server that it would be a bigger project to port all that to python, and there is no guarantee that it would be any faster, because python is known to be slower in general, thanks to the just-in-time compiler of nodejs, which python doesn't have. It just seems that serial communication is faster in python, but not the rest. Each programming language has some pros and cons, and I think nodejs is a good choice.

cprezzi commented 7 years ago

By the way: Our Server Websocket Interface (API) is open, so you can develop your own "server" to use with LW frontend, if you like.

arthurwolf commented 7 years ago

> @arthurwolf It's not at all that easy as you write!

I just meant that adding an http server on top of the python script we currently know is fast, is easy compared to the rest of the work that will need to be done to it so it can be a replacement to the current nodejs stuff ...

> And please stop calling it jim's script, he just copied it from Sonny Jeon.

He derived it from Sonny's script and made improvements that made it faster when communicating with Smoothie: https://github.com/Smoothieware/Smoothieware/blob/edge/fast-stream.py So I call it "Jim's script" because I'm talking about that specific script and no other script ... I just want people to know what script I'm talking about ...

> As Peter wrote, we have so much logic in the server, that it would be a bigger project to port all that to python and there is no guarantee that it would be any faster. It just seems that serial communication is faster in python, but not the rest.

I was just pointing out what would be needed if somebody wanted to explore going that way, not telling anyone to do anything ...

kaosat-dev commented 7 years ago

I am not sure I followed everything, but can the issue be isolated 100% down to node-serialport? There have been numerous discussions over the years about performance bottlenecks, some of which might help, like: https://github.com/EmergingTechnologyAdvisors/node-serialport/issues/402

arthurwolf commented 7 years ago

Just for a bit of history, if you haven't followed everything: there was an issue open on Smoothie's github (https://github.com/Smoothieware/Smoothieware/issues/1096) to investigate more "scientifically" the issues Darklylabs have had with their machines "slowing down"/"jerking", and two discoveries were made.

It's not yet completely clear exactly where the problem lies, and it's very possible it's the interaction of several issues, and that those interactions are different in different setups. It's quite usual for problems that have been discussed this long to be due in fact to several problems.


ghost commented 7 years ago

It's great to see these conversations happening and so many diversely skilled people working together. Everyone's end goal is to make an amazing program that users will be able to make their ideas come to life.

cprezzi commented 7 years ago

@DarklyLabs Thanks to the donation of a Cohesion3D Remix Board from Ray Kholodovsky, I was able to run a lot of smoothie tests over the last few days. I was able to stream ave. 650 lines/s, but had some USB hangups (GetOverlappedResults) from time to time.

To test I made a nodejs gcode-streamer, which reads from a gcode-file. You find it at https://github.com/LaserWeb/LaserWeb3/blob/speed_tests/test-server.js. Use it like node test-server.js COM3 file.gcode.

Can you please test the speed with test-server.js and tell me the results?

ghost commented 7 years ago

Thanks @RayKholo !!!!!


ghost commented 7 years ago

@cprezzi Back at the lab today. Will test and advise. Thanks!

ghost commented 7 years ago

@cprezzi Running into some problems running the test-server.js using the command node test-server.js COM3 file.gcode

Does it need to be in a certain location, or have any other commands run beforehand? Apologies in advance if it's something simple.

The error is:

Error: Cannot find module './config'
    at Function.Module._resolveFilename (module.js:469:15)
    at Function.Module._load (module.js:417:25)
    at Module.require (module.js:497:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (C:\Users\Domenic\Documents\Sync\Temp_rasterTests\test-server.js:2:14)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)

ghost commented 7 years ago

@DarklyLabs clone this branch https://github.com/LaserWeb/LaserWeb3/tree/speed_tests then, inside it, do npm install, then run test-server.js (config.js is a file in that branch (speed_tests)).

ghost commented 7 years ago

Ok, got further with that info. Output:

Smoothieware detected (edge..........)
Queue: 21240

Then no movement. CMD prompt just sits there.

Running on Win10

cprezzi commented 7 years ago

@DarklyLabs Sorry, I forgot to explain exactly what is needed. As Peter explained, it's easiest to use the speed_test branch.

You should then be able to stream the short.gcode file with the command node test-server.js COM3 short.gcode, or the long.gcode file with node test-server.js COM3 long.gcode.

I have deactivated most of the console logs, because they also slow the process down, but you should get a Done: x of y (ave. z lines/s) about every 100 gcode commands. And your machine should move ;)

ghost commented 7 years ago

@cprezzi This is the process we were using. With the short and long.gcode files, it detects smoothieware, reports the queue size and then sits there without any machine movement. The COM3 port is correct for us. Anything else we might be missing?

cprezzi commented 7 years ago

@DarklyLabs Please do a git pull again.
I found the problem and added an optional param for feedOverride ;)

Unfortunately I was only able to reach about 550 commands/s with this version.

ghost commented 7 years ago

Have we done a test yet putting an FTDI on the UART headers and streaming into that (same usb-serial as grbl, not via the mbed USB stack, in other words)?
