espressif / esptool-js

Javascript implementation of flasher tool for Espressif chips, running in web browser using WebSerial.
https://espressif.github.io/esptool-js/
Apache License 2.0
251 stars 101 forks source link

refactor transport into interface to allow extending serial implementations. #122

Open brianignacio5 opened 6 months ago

brianignacio5 commented 6 months ago

Fix #117

This PR separate the Webserial transport into an abstract class AbstractTransport and ISerialOptions for such transport class construct options.

This should enable external serial implementation such as node serialport.

PTAL @cubicap

cubicap commented 6 months ago

Thanks a lot but I do not think this solves the original problem.

To me, it seems like the class implementing the Transport interface still has too much responsibility. I believe the algorithms (e.g. slipReader, hexConvert...) implemented in the class could be separated further away, as they will be the same in different serial port implementations.

I think the Transport "interface" should only implement "raw read", "raw write", setDTR/RTS, getInfo, getPID, connect and disconnect (the current rawRead, rawWrite are not enough, they do too much to be called "raw"), ((I hope I didn't miss anything)). The rest can be moved to a new class, which would be instantiated inside ESPLoader and which wraps the Transport instance.

Another thing I would like you to consider when defining the interface - at least Chromium's webserial is quite fast and is able to set dtr/rts with low enough latency not to cause too many problems. This is not true for serial port implementations for Node.js, which have problems with inverting both of the signals at once during the reset sequence (specifically the SerialPort library has this problem).

It then causes that while the signal on the EN pin is rising (which takes a surprisingly long time), there is a short period when the boot pin is set to logical 1 as well and so the esp boots in the wrong mode (I'm not sure now which sequence it was but can investigate this further later this week).

In my fork, I've solved this problem by updating the reset strategy string and its interpretation:

This solution needs some function for setting both of the pins at once to be present in the interface. Also, I don't know, whether changing the strategy string is a viable option.

Another problem, I see with the implementation is in the _appendBuffer function (and its use). Because Uint8Array serves as a view to an ArrayBuffer instance, the internal buffer can have a different size, than what is being viewed. Because webserial seems to return data in exactly allocated buffers, the function works. This is however not true for Node SerialPort, which always seems to allocate at least something like 32 KiB. As a result, _appendBuffer which concatenates the internal buffers causes the use of uninitialized memory (and seems to generate data from thin air).

Anyways, the use of ArrayBuffer in that place does not make any sense, because Uint8Array has the same API for setting data. (I have spent an annoyingly long time trying to debug where this uninitialized data was coming from when creating my fork)

And just a little nitpick - wouldn't an interface suit the function better than an empty abstract class? Also, maybe it would make more sense to put the docstrings into the interface instead of "the one" implementation?

brianignacio5 commented 6 months ago

Thank you for your feedback.

The main reason I included these functions in the Abstract Transport class is because they are referenced in the ESPLoader class directly. I do agree that some functions are more of utility so will move those too. I think read and write should still be part of the transport because of how each transport can also have their own way of handling things.

Thanks for noticing about ArrayBuffer, will fix these asap.

About why using Abstract class is mostly because it enforce any implementation of Transport classes to implement these methods too which I think is better to keep consistency with esploader transport calls.

cubicap commented 6 months ago

Is it true, that the underlying serial port implementations can be so different, that they would not fit into the API I've outlined?

I firmly believe, that having new rawRead and rawWrite async methods, which would serve as simple "chunk reader/writer" would be enough to cover all implementations.

Suppose the underlying serial port implementation does not support direct read/write via an async function. In such a case, it can be easily transformed using a wrapper, which exposes such API ~~ also, in such a case, the current API would break as well and a similar transformation would still be needed.

While it may cause the implementation to use one more buffer than a more optimal solution, I believe that is not a problem, considering we are in a JS environment already and the data in the buffer wouldn't be stored in multiple copies anyway. On the other hand, we will get a much simpler way of implementing wrappers for new serial port implementations.

cubicap commented 6 months ago

Also, could you please comment on the section, where I'm discussing the handling of DTR/RTS in connection to the reset sequence?

brianignacio5 commented 6 months ago

The current rawRead is already async (need to return a Promise which is how async functions are defined in abstract classes).

The write function is also implemented as async with a Promise result. Is there a need for rawWrite ? Formatting is done from calling functions in exploder (readFlash and command).

I didn't think about the reset functionality because it seems to me that those src/reset.ts function are basically sequential calls to setDTR and setRTS. Assuming that a setDTR is working correctly the sequence in src/reset.ts should remain equally working no ?

We could add a way to inject these usbJTAGSerialReset and customReset functions currently used in connection attempt.

cubicap commented 6 months ago

The problem is not the fact whether the functions are async, but the fact that they have too much responsibility.

The way they are, the Transport can't implement the simple concept of "read N bytes". Instead, it has to implement "read N bytes and apply some algorithm to it" (whether it is the slip reader or some kind of buffering). This will lead to unnecessary code duplication when implementing a second Transport.

Regarding the reset, the same problem occurred not only with the reset string but with the mentioned functions, which implement the sequence directly. The problems I've encountered are caused by the configuration of the signals (in setDTR, setRTS) being too slow in node serialport (at least on Windows), which causes unexpected output during the transition phases that result in the esp ending in the wrong boot mode.

brianignacio5 commented 6 months ago

Isn't read N bytes rawRead ? I expect read needs to do some formatting of some sort anyway but I remove slipReader and slipWriter from abstract class. As you can see slip reader does have some dependency to transport implementation that is why I didn't separate like slipWriter.

Removed some of these function from abstract class and refactor reset methods and other functions.

What do you think @cubicap ?

cubicap commented 6 months ago

It kind of is but still, it does more work than it needs to do. The idea I am trying to show is to reduce the responsibility of the rawRead and rawWrite methods.

Well, the slip reader/writer may not be easily separated from the Transport to be completely independent but they can be separated into a new class which makes use of the Transport (or however we want to call it) which only serves as an abstraction layer around the port without reimplementing other stuff.

What I imagine might look something like the following image: image

Transport only implements abstraction around a serial port and nothing else. All functionality and algorithms are provided by some other class (in the diagram Slip), which makes use of the abstraction layer and presents an interface (which may be similar to the old one) to the ESPLoader.

tve commented 5 months ago

I'm also interested in using esptool-js in node.js and I'd like to echo @cubicap's comments, specifically that read and the slip stuff should be moved out of the base transport. As far as I can tell, the reason that read in is there is to be able to push read data back into the this.leftOver buffer for future consumption. If you add an unread method to the transport that does just that (essentially unread(buffer) { this.leftOver = buffer }) you should be able to move read and slipReader out of the WebSerialTransport.

github-actions[bot] commented 5 months ago

Download the artifacts for this pull request:

brianignacio5 commented 5 months ago

Sorry for late response @cubicap @tve

I've removed the read function to src/utils/slip.ts and implemented mentioned changes.

Please take a look when you can.

cubicap commented 5 months ago

Thanks a lot, the refactor looks mostly like what I had in mind.

I'd say, the methods rawRead and write should have more consistent names (both with or without the "raw" prefix).

Also, I'm still not convinced that using an abstract class instead of an interface for AbstractTransport is right, because an interface enforces the implementation of its methods as well. To me, an abstract class serves more like an incomplete base while an interface only specifies what the class/object should be capable of (which is exactly what AbstractTransport is for). ~~~you are even using the implements keyword in WebSerialTransport, which is typically used for interfaces.

The last thing is tracing -- I'm not sure, whether trace should be a method of the transport object. At this stage, it seems like something logically standalone but somehow connected through the transport. I think it would make more sense for a "tracer" to be an argument of esploader (or if the transport wants tracing, of transport as well). The separation would remove another method, which is needed for the full implementation of a transport and would make it easier to redirect traces to a different output (if someone wants to do so). I can imagine "tracer" being an object with a single method trace, which could do the same as the current trace method in WebSerialTtransport.


The problems with the reset sequence I've mentioned before will still probably persist but I'll investigate further and write a separate issue when I have the time for that.

On a separate note - I'd like to ask, whether Espressif is interested in maintaining the implementation of other selected (possibly popular) transports (like node-serialport or connection through WebUSB). It could make it more clear to users that esptool-js is usable outside of the web context and as a test that everything works when used from inside node.js.

cubicap commented 5 months ago

Oh, I've looked inside the WebSerialTransport and it seems I've missed some stuff regarding slip - while slip is mostly separate, it still partly lives inside webserial transport:

The attribute slipReaderEnabled is weird. Whether reads should be parsed using slip should be in the competence of the caller. It is definitely not the property of a transport. To me, it seems like an artefact of the older version.

The write method should also implement only a "raw" interface and transparently write data to the serial port without any processing. That should be done in the competence of the caller.

Another thing is the leftOver buffer. It does not seem like a good practice to directly access the internal state of the transport. It would be better to have a method unread on the transport, which returns data back to the buffer. The same applies to the read operation, which definitely should not directly touch the buffer (especially when there is the rawRead method).

slipReaderFormat looks more like it should be called slipReaderParse as it takes raw data, parses them and finds a packet.

My suggestion would be to create a class Slip, which takes a reference to the transport and transforms reads and writes to use the slip protocol (like the diagram I sent in December). I think it would be way cleaner than the current free functions, which keep their state in a somewhat independent object.

(a large reason for these changes is that exposing data directly means there is no clear contract as to what can be done with them, which in turn means that to implement a custom transport, the user needs a deeper understanding of how the buffer is used and how it should be managed)

brianignacio5 commented 5 months ago

Thank you for the feedback @cubicap ! Updated the changes to a Slip class and further remove from Transport layer and other changes like the tracer object into a Trace interface. the leftOver was reimplemented in @tve unread(buffer) method. Thank you @tve

Regarding the reset functions you can see that the user can override the reset functions passing reset functions objects to esploader in loaderOptions.

Please take a look guys.

About supporting other transport I would be happy to add support for them if they don't require too much maintaining but community can help. Hopefully we can add an implementation that could be use in node electron and others.

cubicap commented 5 months ago

Thank you, that looks great!

The last thing about the unread method - it serves as a kind of "buffer swap", which to me seems unintuitive. I think, in the places, where we want to just "empty" the internal buffer, using read would be more clear. For returning data back to the buffer, it would probably be more intuitive if the data was prepended to the current leftover bytes - e.g.

function leftOver(buffer) {
    this.leftOver = this.leftOver.length == 0 ? buffer : appendArray(buffer, this.leftOver);
}

I guess, this would mean the behaviour of transport's read would have to change slightly - maybe a special timeout value for reading only the internal buffer?

I think it's a bit closer to what @tve meant.

cubicap commented 5 months ago

I've completely missed the changes to reset. That looks like it should be able to fix most of the problems I've encountered - thanks for that.

I've just noticed that some of the functions like classicReset are not used anywhere inside of esptool. I think it's useful to have the named reset sequences but it might be better to express them using the customSequence to reduce logic duplication (and to make it easier to globally change their behaviour).

It might also be nice to always use the named resets inside of esploader instead of sequence strings to make the behaviour more clear?

cubicap commented 5 months ago

I'm just writing a node-serialport transport to test whether things work in node.js and am wondering whether the packet argument to read is really needed - it only has data that is prepended before the received data. Does it have to be in the read function? It seems like something that can be just as easily done by the caller. Omitting it might make things a bit more clear.

cubicap commented 4 months ago

I've just finished testing the new version with node.js and have encountered several problems:

1) the intended behavior of read is not very intuitive (e.g., even if the packet argument is larger than minData, it has to read at least something - I think the packet argument really should be removed, as I'd suggested yesterday)

2) node module resolution is somewhat different than what rollup works with, which leads to some compatibility issues

3) it would be nice to have the various resets exposed via methods of esploader (so the user can, for example, easily call hard reset)

4) most attributes of LoaderOptions are not optional, while they probably should be (the constructor of esploader treats them that way)

Just to illustrate the first point - I believe I know how esptool-js works internally and how it should be used better than an average user. Still, I've spent more than an hour debugging what I am doing differently than the webserial transport.

To make the tool portable, the json files will probably have to be loaded using a different approach. (the solution I use in my fork is transforming the jsons into ts source files, but it's not very elegant, and I'm not sure if it is even ok because of their license).


Other than that, things seem to work well.

cubicap commented 4 months ago

I had a typo in one of my comments - in the code snippet, the function name should have been unread instead of leftOver.

The current version behaves differently than the typical "unread" operation. Intuitively, it should just put the data back to the input, so they are the first bytes read by a read operation.

Please take a look at C's ungetc - it only pushes the character into the read buffer. Similar behaviour can be observed with Node.js' Readable.unshift

Such behaviour would be preferable because it is easier to imagine and more conventional.

Changing the behaviour would remove a way just to read the leftOver buffer, but I think Slip could be changed not to rely on that. Alternatively, read could have a special timeout value, with which only the buffer content is returned (and if there is not enough data, an error should be thrown to be consistent)

brianignacio5 commented 4 months ago

For some reason the re implementation of this transport layer made the chip connection very unstable. Need to go back to review what went wrong.

cubicap commented 4 months ago

Interesting... When I tried the last commit with my ESP32-S3, things seemed to work fine both in a browser and Node.js. (I did not notice any difference from before)

brianignacio5 commented 3 months ago

@cubicap I've rebase and went back a bit to update the PR. Please take a final look so we can merge.

Also @balloob Please take a look, this PR should not change the behavior but allow NodeJS projects to also use it.

cubicap commented 3 months ago

Thanks for the work you've done.

At first glance, I see a few things missing - the unread operation we discussed is not implemented and the read operation therefore still suffers from the mentioned problems.

I am also not sure about Node.js - the imports of json stubs in the "target.ts" files do not seem to work under it.

And just a nitpick regarding the reset functions interface - the ESPLoader constructor appears to check all of the reset functions for being defined (which seems sensible and allows for not specifying the defaults explicitly), however, the interface ResetFunctions does not allow for the functions to be undefined (individually).

brianignacio5 commented 3 months ago

I haven't been able to successfully refactor leftOver out of slip read function without breaking the whole connection workflow. If you manage to do so please let me know or suggest some way to do because I can't seem to fix it.

About reset function can fix this. Regarding the JSON import, I believe tsconfig.json resolveJsonModule should allow to import it in your project. Isn't that the case ? Maybe we can test with a dynamic import() ?

cubicap commented 3 months ago

I have resolveJsonModule set to true, but it still does not work.

I think the problem is that while TypeScript has the concept of json imports, ES does not. AFAIK, it should work in CommonJS modules but ES is the standard that should be targeted (or that's what I believe). Dynamic import will probably have the same problems as the "normal" one. There is a proposal for ES to add "import attributes", which should allow json imports but it's not standardized yet.

I wanted to finalize the interface as much as possible so that more breaking changes are not necessary in the near future. If it's not a problem, I can look at the leftOver and try to come up with something when I have more time (which, to be honest, does not seem to be very soon).