christiansandberg / canopen

CANopen for Python
http://canopen.readthedocs.io/
MIT License
439 stars 196 forks

Slave node support #47

Open christiansandberg opened 6 years ago

christiansandberg commented 6 years ago

This is a general issue for adding support for creating slave nodes. This could be used for implementing real nodes on a CANopen network (using e.g. RaspberryPi) or simulating nodes in a test environment.

This will be divided up in smaller subtasks:

christiansandberg commented 6 years ago

Since the library now has to handle simultaneous connections, I can see three ways of handling this:

In my opinion, state machines are hard to implement and read. The control flow is not obvious. Threads are difficult to manage and debug. Asyncio allows writing simpler worker routines for each service but it only works in Python 3 and we should probably support both asynchronous and traditional synchronous programming.

Any opinions on this? How many still need to use Python 2.7?

TobiasPleyer commented 6 years ago

I agree with the above points. State machines can get messy and hard to read, but in our case the set of possible states and transitions is very limited, so I think they would still be a viable option. I personally have some experience with asyncio and it does make things easier (even though the mechanics behind it are not trivial), but the missing 2.7 support really is an issue.

I want to add a conceptual idea: local and remote nodes. "Local" means handled by this library, while "remote" means impersonated by another device on the bus. This concept is easy to read and understand and allows for shortcuts: when two "local" nodes communicate with each other there is no need to use the hardware bus; instead the communication can be handled internally by the library in software.
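The internal shortcut for local-to-local traffic could be sketched roughly like this. All class and method names here are hypothetical, not part of the canopen API:

```python
class Node:
    def __init__(self, node_id, local):
        self.node_id = node_id
        self.local = local  # True: served by this library; False: a device on the bus


class Network:
    def __init__(self):
        self.nodes = {}

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def send_request(self, target_id, payload):
        target = self.nodes.get(target_id)
        if target is not None and target.local:
            # Both ends live in this process: skip the hardware bus entirely
            return self.dispatch_locally(target, payload)
        return self.send_on_bus(target_id, payload)

    def dispatch_locally(self, node, payload):
        # Placeholder: a real implementation would invoke the node's services
        return ("local", node.node_id, payload)

    def send_on_bus(self, target_id, payload):
        # Placeholder: a real implementation would put a frame on the CAN bus
        return ("bus", target_id, payload)
```

The point is only the dispatch decision: the network checks whether the target is one of its own local nodes before touching the bus.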

TobiasPleyer commented 6 years ago

Question: Do we really need multiple connections? Isn't everything streamlined by the underlying can library?

christiansandberg commented 6 years ago

I've started to implement something simple using state machines. It might not need to be very complicated if we don't need to have advanced fault handling like unexpected requests or timeouts. You can create either remote or local nodes and add them to the network with similar interfaces. I'll upload a branch soon.

christiansandberg commented 6 years ago

There is a branch called local-nodes. I've implemented basic support for SDO expedited and segmented upload and download. You can either write to the local node to set values in the object dictionary, or you can register callbacks that can be set up to handle whenever an upload or download occurs. I'm open to suggestions and improvements!

TobiasPleyer commented 6 years ago

Thanks for the quick first draft!

So far I have only had the chance to take a first glance, and I like the restructuring you did. I hope I can give you more in-depth feedback soon. For now I'll have to look into this in my private time. On this occasion I'd like to ask how you do verification and testing of the library? What is your test setup? I am left a bit clueless here...

Just from reading over it several times I have the feeling the (Blocked)StreamReader/Writer implementations still have potential for future optimizations (shrinking the code size). Also, the many "Variable" etc. classes look like they could be generalized.

I personally consider SdoClient and SdoServer and the PDO transmit/receive functionality a "trait" or API guarantee of a node. Hence I would have each node inherit those traits depending on its type (local/remote), e.g. class LocalNode(BaseNode, SdoClient, SdoServer, PdoTransmit, PdoReceive). If an SDO upload request is sent on the bus, the library could extract the respective node id, check whether this id is in the local nodes list, and only then would it know that it can serve the request.
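A rough sketch of that trait idea, with purely illustrative class names (not the library's actual API):

```python
class BaseNode:
    def __init__(self, node_id):
        self.node_id = node_id


class SdoServerTrait:
    def handle_sdo_upload(self, index, subindex):
        # A real implementation would look up the object dictionary
        return ("upload", self.node_id, index, subindex)


class PdoReceiveTrait:
    def handle_pdo(self, data):
        return ("pdo", self.node_id, data)


class LocalNode(BaseNode, SdoServerTrait, PdoReceiveTrait):
    """A node whose services are provided locally by this library."""


local_nodes = {5: LocalNode(5)}


def on_sdo_request(node_id, index, subindex):
    # Serve the request only if the addressed node is one of our local nodes
    node = local_nodes.get(node_id)
    if node is None:
        return None  # not ours: some other device on the bus may answer
    return node.handle_sdo_upload(index, subindex)
```

Each trait mixin contributes one service, and the dispatch step filters requests by node id exactly as described above.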

I hope I expressed myself comprehensibly.

christiansandberg commented 6 years ago

There are some tests in the test folder, often based on what the CAN communication should look like. They are more like integration or acceptance tests than pure unit tests. Other than that it’s just based on using it in the real world. I developed it during my previous employment but currently I don’t have anything to test against, unfortunately.

The implementation can most certainly be optimized. I wanted to be able to read a few bytes at a time using SDO so that’s why I implemented it as I/O streams.

The Variables are also a bit confusing I agree. I wanted to share logic between Object Dictionary, SDO and PDO. If you have a better way of organizing it I’m open for suggestions.

A node can have multiple SDO servers and also I wanted to make it clear which service you are using. A variable could for example be read or written using either SDO or PDO with similar API. That’s why a node has a bunch of attributes representing services instead of using inheritance.

TobiasPleyer commented 6 years ago

I can understand your thinking behind this concept. It allows nice low-level control of things. I am currently looking a lot into your code and trying to play around a bit with alternatives. One concern I have is how this library can fit into a "bigger picture". I am thinking of my concrete use case for which I want to use this library: as one building block in a big asyncio application. Currently this library makes explicit use of threads, which makes it not ideal to integrate with asyncio. But on the other hand it has to be possible to use it stand-alone, of course...

Currently I don't have satisfying answers to those points. I just wanted to keep you updated and informed that I'm still on it ;)

PS: I have one question about SdoClient.request_response. I wonder why this doesn't deadlock? If I read the code right then, in the same thread, we send a message and immediately block on reading the queue. But the queue is only filled by on_response, which I would expect never to be called because we are blocked in queue.get.

christiansandberg commented 6 years ago

Receiving of CAN messages is done in a different thread. It's somewhat hidden behind the can.Notifier class. So the main thread blocks until a response is received in the background thread.

To use this with asyncio you should use run_in_executor to make the blocking operations non-blocking. You can use one executor per node or a pool of executors.

vendor_id = await loop.run_in_executor(executor, node.sdo['Identity object']['Vendor-ID'].read)

Then you should be able to feed messages in your event loop to Network.notify(), see the docs.
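Expanding the one-liner above into a self-contained sketch: the executor setup and the coroutine name are my own choices, while the node.sdo[...] access follows the example given above.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# One worker per node keeps that node's SDO traffic serialized, since SDO
# transfers to a single node must not overlap.
executor = ThreadPoolExecutor(max_workers=1)


async def read_vendor_id(node):
    """Run the blocking SDO read in the executor so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        executor, node.sdo['Identity object']['Vendor-ID'].read)
```

While the read blocks inside the worker thread, the event loop remains free to feed incoming CAN messages to Network.notify(), which is what lets the response ever arrive.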

fredrikhoyer commented 6 years ago

Hi

Just for your information, I have forked your project and am working on extending the CANopen slave support that you have added.

The intention is to use this for automated testing of the CANopen master that we have developed. The CANopen master is based on the CANopen stack from Port. So far we have been using Vector for testing, but I think it will be easier to automate the test set-up with a Python CANopen slave.

I plan to extend the NmtSlave class that I have added with more functionality.

I'll initiate a Pull Request once I have the canopen slave working towards the master.

christiansandberg commented 6 years ago

Great! It will be very much appreciated.

fredrikhoyer commented 6 years ago

Hi Tobias

@TobiasPleyer

I think that it is better to follow up your comment here. You wrote:

> Hi Christian, hi Fredrik
>
> I am happy to see both of you working on the server support. A couple of months ago I also uttered my interest in this extension. Back then I didn't have green light for the open source contribution. This has now been settled and I would be happy to support you guys. @fredrikhoyer Please let me know what your course is for the next changes and how I can support you with it. For the next 1-2 weeks I should have a nice time budget to contribute.

Thanks for your help Tobias

In my mind it is a good idea to look at @christiansandberg's initial post for this issue and see where we are.

Christian identified the subtasks below. I added what I consider the status of these tasks.

So the next item on the list, I would think, is PDO slave support.

Since what I needed were the top 4 subtasks, I haven't thought much about PDO support. But PDO support would be of interest to me. Maybe @christiansandberg has some thoughts on how the PDO slave support could be structured.

PS: I can only work on this after I finished my day-time job, so my contribution is best-effort only.

christiansandberg commented 6 years ago

Thank you for your contribution! I added a quick EMCY producer too (without any inhibit time or anything).

Next big thing is probably PDOs. I haven't thought too much about it. A lot of the code can probably be shared. We probably need to monitor changes to PDO configuration and values and then update the data that's being transmitted.

TobiasPleyer commented 6 years ago

I agree, PDOs are the next big thing on the list. I am currently working out a first draft. PDOs are not trivial to implement because, if they are configured to trigger on change, they need a trap of some sort to catch data changes made to the data_store of the controlling LocalNode instance. I hope I can push a draft for review soon.
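One conceivable shape for such a trap, assuming a dictionary-like data store: the class and its method names below are hypothetical, not what the draft actually does.

```python
class ObservableStore:
    """A data store that notifies listeners whenever an entry changes,
    so that event-driven (on-change) TPDOs can be triggered."""

    def __init__(self):
        self._data = {}
        self._listeners = []

    def add_listener(self, callback):
        # callback(key, value) is invoked on every effective change
        self._listeners.append(callback)

    def __setitem__(self, key, value):
        changed = self._data.get(key) != value
        self._data[key] = value
        if changed:
            for callback in self._listeners:
                callback(key, value)

    def __getitem__(self, key):
        return self._data[key]
```

A TPDO handler could register a listener that queues a transmission whenever one of its mapped entries changes; unchanged writes produce no traffic.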

TobiasPleyer commented 6 years ago

Hi

I have pushed the first draft version for PDO support. The current status is definitely work in progress, but I wanted to collect your feedback as soon as possible.

Regards, Tobias

christiansandberg commented 6 years ago

The code is now in master, but it is not very well tested and documentation is pretty much non-existent.

braunsonm commented 6 years ago

Hey @christiansandberg I'm really interested in the async support for this library. Any idea when we might get a release?

Is there any way I can assist in testing? I'm specifically interested in the PDO support.

christiansandberg commented 6 years ago

So far the slave nodes will work well in an asynchronous environment as they don't do any blocking operations; they basically just respond to incoming CAN traffic. So if you only need to provide slave nodes, you can run this library asynchronously in the main thread with a little custom code. I'm adding asyncio support to python-can, so then it will be even easier.

The master functionality is a bit worse. SDO operations are blocking so if you need to use it with asyncio you will need to use run_in_executor to make it non-blocking. Other services like PDO, EMCY, and heartbeat can use callbacks to signal when those are received.

How are you planning to use it?

christiansandberg commented 6 years ago

However, as soon as we need to do time related things in the slave nodes, like timeout handling or throttling of messages (e.g. EMCY inhibit time), the current state machine solution won't work. Then we may need to make it asynchronous anyway. Making the master API asynchronous would also make sense since you can easily communicate with several nodes simultaneously (like reading PDO configurations from all at once).

I have been thinking for a long time about whether it is possible to support asyncio somehow. It won't be possible without dropping support for Python 2.7. It may be possible to provide a synchronous API but it won't be pretty. The best way forward would probably be to fork this project into something like aiocanopen and replace all synchronous operations with asynchronous equivalents. The SDO code would have to be completely rewritten, but otherwise it should be fairly easy to keep the two projects similar enough to merge changes between them.

Would that be interesting for someone?

acolomb commented 6 years ago

To me it sounds like a nightmare having to merge changes between an asynchronous fork and this version. If someone needs it, I'd suggest implementing it in an independent module within this package, like from canopen import sdo_async as sdo.

As far as I'm concerned, a synchronous SDO API is totally sufficient on the master side as Python makes it painless to wrap it in a thread if needed. SDO slave support is a different story, but I can imagine it would even work within the receiving thread as long as the response only involves a dictionary lookup. For longer transfers, we'd need worker threads like e.g. web servers have. I haven't looked at any potential SDO locking issues, but making the state machine worker thread local would be a feasible solution.
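The split described above (cheap responses served inline in the receiving thread, longer transfers handed to a worker) might look roughly like this; all names are illustrative:

```python
import queue
import threading

# Toy object dictionary keyed by (index, subindex); values are raw bytes
object_dictionary = {(0x1008, 0): b'demo'}
work_queue = queue.Queue()


def on_sdo_request(index, subindex, send_response):
    """Called from the CAN receiving thread for each SDO upload request."""
    value = object_dictionary.get((index, subindex), b'')
    if len(value) <= 4:
        # Expedited transfer: a plain dictionary lookup is cheap enough
        # to answer directly in the receiving thread
        send_response(value)
    else:
        # Segmented transfer: hand off so the receiving thread is not tied up
        work_queue.put((index, subindex, send_response))


def worker():
    while True:
        index, subindex, send_response = work_queue.get()
        send_response(object_dictionary[(index, subindex)])


threading.Thread(target=worker, daemon=True).start()
```

Per-transfer state (the segmented position, the toggle bit) would then live in the worker, which also sidesteps the locking question raised above.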

braunsonm commented 6 years ago

Apologies @christiansandberg, I'm more interested in the async support that comes along with this, not specifically in creating slave nodes. For SDO calls, using run_in_executor is probably the best current solution; however, I thought this issue was tracking the addition of native async support.

af-silva commented 6 years ago

@christiansandberg just an idea: for asynchronous support we could use an external library like PySignal or PySide, which implement the signal/slot mechanism used by the Qt Framework and work on both Python 2.7 and 3.x. It's just a thought.

gsulc commented 4 years ago

Why support Python 2.7? It's old. It's losing support and won't be maintained after 2019. This is a fairly new package. Who in their right mind would choose to develop a new application around Python 2.7?

christiansandberg commented 4 years ago

We can certainly drop support for Python 2.7 now if it makes sense. The next problem is how to provide both synchronous and asynchronous APIs.

gsulc commented 4 years ago

Good question. What is the goal? What are the bottlenecks of a LocalNode? I think we should start by listing those situations.

If I start answering my own question: it looks like the callbacks could take up too much time. If I'm not mistaken, they are synchronous. Anything else?

christiansandberg commented 4 years ago

That’s one issue with today’s synchronous implementation that some have discovered: you can’t do any blocking operations in your callbacks.

Another issue is that the SDO client operations are blocking the main thread so you can’t efficiently communicate with multiple nodes simultaneously.

I tried implementing an asyncio ISO-TP protocol and also provide a synchronous version of the API. I did that by creating an event loop and executing it in a new thread. The main thread used thread safe asyncio calls to start tasks and get results. It’s far from pretty but might work.
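That pattern (a private event loop running in a background thread, wrapped by a synchronous facade) can be sketched with standard asyncio calls; the SDO coroutine here is a stand-in, not real transfer code:

```python
import asyncio
import threading

# Event loop owned by a background thread; the main thread never runs it
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()


async def sdo_read(index):
    """Stand-in for an asynchronous SDO transfer."""
    await asyncio.sleep(0.01)  # pretend to wait for the response frame
    return index * 2


def sdo_read_sync(index, timeout=1.0):
    """Synchronous API on top of the asynchronous core: schedule the
    coroutine thread-safely and block on its result."""
    future = asyncio.run_coroutine_threadsafe(sdo_read(index), loop)
    return future.result(timeout)
```

As noted above, it's far from pretty, but it lets one asynchronous core serve both an async API (await sdo_read(...)) and a synchronous one (sdo_read_sync(...)).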

bggardner commented 4 years ago

What's the latest on this? As I mentioned in #188, my socketcanopen module provides for standalone master/slave nodes using threading to handle sending/receiving/processing messages, including SDO. PDO support is limited and untested. Overall, it's far from pretty, but I've been using it on Raspberry Pis for a few years now as CANopen masters. See the example node applications.

I'd really like to transfer this functionality to this (canopen) repo, but believe it warrants some discussion first. I took the approach of letting each app/node connect to a can.BusABC and handle their connections individually, while canopen appears to use the Network class instead. I also developed socketcanopen using a home-grown socketcan module, but recently decided to switch to python-can. So, now I could take advantage of the can.Notifier class instead.

In short, given the different initial approaches between canopen (omniscient master) and socketcanopen (standalone node), should we attempt to merge them into one (maybe target canopen v2.0?), or keep them separate? Looking forward to everyone's thoughts.

af-silva commented 4 years ago

Hi @bggardner, I don't want to compare this work with yours since I don't know your code and I don't want to be unfair. That being said, you are more than welcome to contribute to this library. @christiansandberg is responsible for most (90%) of the work; I just tweaked it a little and added support for DS402 (it's not the most elegant solution, but it is working).

Regarding my usage of the library, I use it as a master without any hiccups to control remote DS402-capable devices. Everything works as expected, from the SDO and PDO services to the handling of EMCY messages. One open issue/improvement here is the capability of creating slave nodes with this library, in other words implementing SDO and PDO clients capable of storing and reading values to/from an object dictionary. I think you have implemented this functionality through the use of threads; threads have been discussed here and may be a way to go.

So I think we have 3 main scenarios (I might be wrong):

1) Simulate the CAN bus interface and launch the master and slave nodes associated with the same Network object;
2) Have an external master and allow more than one node (i.e. slaves) to connect to the same Network (CAN bus interface hardware) and communicate with the master;
3) Have an external master and allow more than one node to connect to different Networks (CAN bus interface hardware) and communicate with the master.

Depending on the scenario we may need to provide asynchronous and synchronous communication with the library and its different parts. Like @christiansandberg said, we may use a main thread loop that on each pass executes a stack of calls to run different scheduled tasks, emulating for instance a microcontroller. This behavior must be defined, and I think some caution must be taken regarding execution times and timeouts on some operations.

What are your suggestions? Which parts of your API do you think could be ported and enabled here?

Another milestone should be to drop support for Python 2.7 and migrate everything to Python 3.

christiansandberg commented 4 years ago

I've been thinking about it for a while but haven't come up with much. The slave node support is only partial and it probably has a long way to go before it can be considered production ready.

My initial approach was to be able to re-use the same API for both slave (local) nodes and master (remote) nodes as much as possible, but in hindsight that starts to seem a bit weird and unnecessary. In most cases you will be either a master or a slave, so maybe there's not much use in trying to combine them under the same roof. In that case maybe it is better to have one library for masters and one for slaves? Although they could share some common code.

As I mentioned earlier, I still think asyncio could be a nice fit for a slave implementation as you could easily process various tasks in parallel without needing threads or state machines. It is quite similar to an HTTP server after all.

bggardner commented 4 years ago

Thanks @af-silva and @christiansandberg for keeping the discussion going. I agree with your scenarios and dropping support for Python 2.7. I vote (if I have one) for using a common library (this one) to support slaves, masters, etc., especially when we decide to add support for CANopenFD, CANopenXL, etc. in the future. I'm relatively new to asyncio, so it might take more time to go that route, but I'm not opposed.

I believe there can be lots of confusion when talking about a CANopen "Master", as that can mean lots of things. Is it a full CANopen node (with a node-ID and an object dictionary), or just a participant on the CAN bus that uses some CANopen protocols?

In my opinion, there can be one or more CANopen "nodes" that can perform each of the functions of a "Master", such as a SYNC and TIME producer, in addition to the typical SDO, PDO, etc. protocols that slaves use, and the enabling/disabling of features are all determined by the relevant entries in their object dictionary. (Example: Masters have bit 0 set in object 0x1F80; slaves either don't have this object or the bit is cleared.) So, there really isn't a "slave" or "master" class, but just Node. This (the Node class) is the biggest part of my API (socketcanopen) that I want to transfer, which includes an SDO client, TPDOs, SYNC producers, heartbeat consumers, flying master (CiA DSP 302-2) support, and soon, network redundancy (CiA DSP302-6).
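As a tiny illustration of that point, the master role could be derived from the object dictionary itself rather than from a dedicated class (per CiA 302, bit 0 of object 0x1F80 "NMT startup" marks the NMT master); the dictionary representation here is simplified to a plain dict:

```python
def is_nmt_master(object_dictionary):
    """Decide the node's role from its own object dictionary: no separate
    Master class, just the presence and value of object 0x1F80."""
    nmt_startup = object_dictionary.get(0x1F80)
    if nmt_startup is None:
        return False  # object absent: a plain slave
    return bool(nmt_startup & 0x01)  # bit 0 set: this node is the NMT master
```

Other master functions (SYNC producer, TIME producer, flying master) would be enabled or disabled by their own object dictionary entries in the same way.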

However, many times you want to communicate using SDOs and PDOs (and other protocols) with controlling or monitoring applications that don't need to send heartbeats or have node-IDs and object dictionaries. These would use a separate class (not Node, but maybe Participant, or something generic like that - maybe this is what the canopen.Network class was meant to be?). These may be used in protocol adapters (CAN-to-WebSockets, e.g.), or maybe to bridge different physical networks together.

So, if we have canopen.Node, and canopen.Participant classes, how should they communicate with the can.BusABC class? Using canopen.Network (as is currently done), or can.Notifier directly (as is currently done inside canopen.Network, and similar to socketcanopen), or something else? I'm just brainstorming here, so feel free to criticize.

christiansandberg commented 4 years ago

My knowledge of CANopen is really quite basic and outdated; you know a lot more than I do. I developed this library for simple control of CANopen devices in a test environment due to a lack of other options, but did not consider use cases where it would be a compliant member on a real CAN bus. It may not be possible to accomplish that without a completely different API, but if you can find a way to merge the two libraries without too many external changes to the interfaces, I'm all ears.

The canopen.RemoteNode objects are representations of other "slave" nodes with various APIs to control them. The canopen.Network class is mostly a bridge between the CAN bus and the node objects and also contains services not directed to a specific node (like SYNC and NMT broadcast messages). This is a very crude simplification of what CANopen really is. It just made sense to me in the context I used it.

Bridging CAN buses, if it is protocol agnostic, may be better suited outside a CANopen library. I have also developed a WebSocket based interface for python-can for that purpose.

sveinse commented 3 years ago

I have started some work porting canopen to support asyncio. See issue #272