Closed: marcfir closed this 4 months ago

Hi Robert, I am working on a Rust wrapper for libethercat, using `example_with_dc.c` as a reference. The DC example works in C with my demo setup, but going to SAFEOP in the Rust wrapper causes unexpected idx errors in HW_RX like `HW_RX : received idx 133, but we did not send one?`.

Simply calling the wrapper functions in Rust according to `example_with_dc.c` should result in the same behavior. Adding some `printf` calls to send and recv showed that there are multiple packets being sent with the same idx. Is this a timing issue?

If I understand correctly, TX and RX are handled in different threads. The TX in the DC example is handled by `cyclic_task`. `ec_send_distributed_clocks_sync` and `ec_send_process_data` queue datagrams in the `tx_high` queue, which are then sent with `hw_tx`. The ecat.rx thread is spawned during `ec_open` and checks the `idx` in the datagram in `hw_process_rx_frame`. `tx_send[p_entry_dg->idx]` is set before the actual sending.
Hi Marc,

> Going to SAFEOP in the Rust wrapper causes unexpected idx errors in HW_RX like `HW_RX : received idx 133, but we did not send one?`.

IMHO that should never happen. I usually get this if more than one EtherCAT master instance is accidentally sending packets over the same interface.
> Simply calling the wrapper functions in Rust according to `example_with_dc.c` should result in the same behavior. Adding some `printf` calls to send and recv showed that there are multiple packets being sent with the same idx. Is this a timing issue?
If it is the only process, it might be timing. But I don't quite see it, because the indices are only reused once they have been received after the round trip, or after some timeout occurred (the timeout should be very high, and the packet round trip is usually less than 100 us). Do you get any timeout messages? How are the priorities/scheduling of the TX and RX threads configured?
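To make that reuse rule concrete, here is a minimal sketch of such an index pool; `idx_pool_t`, `idx_get`, and `idx_put` are hypothetical names for illustration, not libethercat's actual API. The invariant is that an idx can only reappear on the wire after the RX path (or a timeout) released it:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define IDX_COUNT 256

/* Hypothetical index pool: illustrative names, not libethercat's
 * actual data structures. */
typedef struct {
    pthread_mutex_t lock;
    bool in_use[IDX_COUNT];  /* true while a datagram is in flight */
    unsigned next;
} idx_pool_t;

/* TX side: reserve a free idx before the datagram is sent. */
static int idx_get(idx_pool_t *p) {
    int idx = -1;
    pthread_mutex_lock(&p->lock);
    for (unsigned i = 0; i < IDX_COUNT; ++i) {
        unsigned cand = (p->next + i) % IDX_COUNT;
        if (!p->in_use[cand]) {
            p->in_use[cand] = true;
            p->next = (cand + 1u) % IDX_COUNT;
            idx = (int)cand;
            break;
        }
    }
    pthread_mutex_unlock(&p->lock);
    return idx;  /* -1 means every idx is still in flight */
}

/* RX path (frame returned) or TX path (timeout): release the idx.
 * Only after this may the idx appear on the wire again. */
static void idx_put(idx_pool_t *p, int idx) {
    pthread_mutex_lock(&p->lock);
    p->in_use[idx] = false;
    pthread_mutex_unlock(&p->lock);
}

int main(void) {
    idx_pool_t pool = { .lock = PTHREAD_MUTEX_INITIALIZER };
    int idx = idx_get(&pool);  /* before the send */
    printf("sending datagram with idx %d\n", idx);
    idx_put(&pool, idx);       /* after the round trip, or on timeout */
    return 0;
}
```

Under that invariant, two frames carrying the same idx should only be possible if a timed-out frame is still in flight when its idx is handed out again.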
> If I understand correctly, TX and RX are handled in different threads. The TX in the DC example is handled by `cyclic_task`. `ec_send_distributed_clocks_sync` and `ec_send_process_data` queue datagrams in the `tx_high` queue, which are then sent with `hw_tx`.
Yes, that's the case if you are using the SOCK_RAW interface. If you are using one of the hacked drivers ('linux' subfolder), they can be configured in EtherCAT polling mode. In this mode the NIC does not generate interrupts, and after the TX it waits until the frame has returned (so there is no RX thread).
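Roughly, the two receive models look like this (stubbed, hypothetical `nic_send`/`nic_recv` helpers; only the structure matters, not the real driver code):

```c
#include <pthread.h>
#include <stddef.h>

/* Stubbed NIC access with hypothetical names, only meant to contrast
 * the two receive models, not to mirror the real driver code. */
static int nic_send(const void *frame, size_t len) { (void)frame; (void)len; return 0; }
static int nic_recv(void *frame, size_t maxlen)    { (void)frame; (void)maxlen; return 0; }

/* SOCK_RAW model: a dedicated ecat.rx thread blocks on the socket and
 * processes whatever returns, independently of the TX thread. */
static void *rx_thread(void *arg) {
    unsigned char frame[1518];
    (void)arg;
    for (;;) {
        int len = nic_recv(frame, sizeof(frame));  /* blocks or times out */
        if (len > 0) {
            /* hand the frame to the idx bookkeeping (hw_process_rx_frame) */
        }
    }
    return NULL;
}

/* Polling model ('linux' drivers): no interrupt and no RX thread; the
 * sender itself polls the NIC until its own frame has returned. */
static int tx_and_wait(const void *frame, size_t len) {
    unsigned char rx[1518];
    nic_send(frame, len);
    while (nic_recv(rx, sizeof(rx)) <= 0) {
        /* busy-poll until the frame is back */
    }
    return 0;
}
```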
> The ecat.rx thread is spawned during `ec_open` and checks the `idx` in the datagram in `hw_process_rx_frame`. `tx_send[p_entry_dg->idx]` is set before the actual sending.
You mean the RX thread receives a datagram with some idx before the TX thread has marked that idx in the tx_send array? This should not be possible, because the frame is filled with datagrams here https://github.com/robert-burger/libethercat/blob/d9c124be9059a71cc81b2941cf2ec3a9b1736b1a/src/hw.c#L297 and marked in tx_send, and this happens before the send done here https://github.com/robert-burger/libethercat/blob/d9c124be9059a71cc81b2941cf2ec3a9b1736b1a/src/hw.c#L283
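A condensed sketch of that ordering, with made-up names rather than the actual hw.c code:

```c
#include <stddef.h>
#include <string.h>

#define MAX_IDX 256

struct datagram { unsigned char idx; /* plus header and payload */ };

static struct datagram *tx_send_tbl[MAX_IDX];  /* in-flight lookup for RX */

static int put_on_wire(const unsigned char *frame, size_t len) {
    (void)frame; (void)len;  /* stand-in for the real send */
    return 0;
}

/* The datagram is copied into the frame and recorded in the lookup
 * table BEFORE the frame leaves, so the RX thread can never see an
 * idx that was not marked as sent. */
static void tx_frame(unsigned char *frame, struct datagram *dg) {
    size_t pos = 0;
    memcpy(frame + pos, dg, sizeof(*dg));
    pos += sizeof(*dg);
    tx_send_tbl[dg->idx] = dg;  /* step 1: mark as in flight */
    put_on_wire(frame, pos);    /* step 2: only now send */
}
```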
Just to mention, I have no experience programming in Rust, but is the wrapper code accessible anywhere?
Hi Robert,
> Do you get any timeout messages? How are the priorities/scheduling of the TX and RX threads configured?

I get timeouts. Here is the log (log.txt).

ps gives the following, so 99 and 100. This is the same as when running `example_with_dc` on my system.
11796 main TS 19 libethercat-rs/main
11796 ecat.rx FF 99 libethercat-rs/main
11796 ecat.asy TS 19 libethercat-rs/main
11796 ecat.mbx TS 19 libethercat-rs/main
11796 ecat.mbx TS 19 libethercat-rs/main
11796 ecat.mbx TS 19 libethercat-rs/main
11796 ecat.mbx TS 19 libethercat-rs/main
11796 ?M7 FF 100 libethercat-rs/main
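For reference, the FF entries are SCHED_FIFO real-time priorities. A thread typically acquires such a priority via standard POSIX scheduling calls; a minimal sketch (not libethercat code, requires root or CAP_SYS_NICE):

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Give the calling thread a SCHED_FIFO priority like the FF/99 shown
 * for ecat.rx in the ps output above. */
static int make_rt(int prio) {
    struct sched_param sp = { .sched_priority = prio };
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
    if (rc != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(rc));
    return rc;
}
```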
Ah, I see. And getting a new idx https://github.com/robert-burger/libethercat/blob/d9c124be9059a71cc81b2941cf2ec3a9b1736b1a/src/idx.c#L56 is protected with a mutex. So the problem is the timeouts?
> Just to mention, I have no experience programming in Rust, but is the wrapper code accessible anywhere?

Not yet. I have to do some cleanup.
Yes, I think it's timing (and scheduling) here, because if the sender gets a timeout (for whatever reason), the idx is returned to the queue and reused later.

Btw: none of the threads should produce that much CPU load. I'm currently wondering why the socket timeout for the SOCK_RAW interface is configured to 1 us (https://github.com/robert-burger/libethercat/blob/d9c124be9059a71cc81b2941cf2ec3a9b1736b1a/src/hw_sock_raw.c#L136). Can you try increasing that to e.g. 1000 us? Then at least ecat.rx should have much less CPU load, and since one MTU-sized frame has to return in less than that, this should still be fine here.
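Since SO_RCVTIMEO is a standard socket option, the change would look roughly like this sketch (`set_rx_timeout_us` is a made-up helper, not the actual hw_sock_raw.c code):

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Bump the receive timeout on a raw socket, e.g. from 1 us to 1000 us.
 * A timeval of {0, 1000} means 1000 microseconds. */
static int set_rx_timeout_us(int fd, long usec) {
    struct timeval tv = {
        .tv_sec  = usec / 1000000,
        .tv_usec = usec % 1000000,
    };
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
/* usage: set_rx_timeout_us(sock_fd, 1000); */
```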
I found the problem. I missed https://github.com/robert-burger/libethercat/blob/d9c124be9059a71cc81b2941cf2ec3a9b1736b1a/tools/example_with_dc/example_with_dc.c#L91 in the cyclic task, so the thread flooded the network.
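For anyone who hits the same symptom: without a wait, the cyclic task spins and floods the bus. A generic cyclic loop with an absolute-time sleep, in plain POSIX as a stand-in for whatever the linked example line does, looks like:

```c
#include <time.h>

#define CYCLE_NS 1000000L  /* assumed 1 ms cycle time */

/* Cyclic task with an absolute-time wait. Without the clock_nanosleep()
 * the loop spins and floods the network with frames. */
static void cyclic_task(void) {
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
        /* ... ec_send_process_data(), ec_send_distributed_clocks_sync() ... */

        next.tv_nsec += CYCLE_NS;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec  += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}
```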