Closed jubeormk1 closed 1 year ago
Same problem with a single EL1004 slave:
[2023-07-11T02:43:25Z INFO ek1100] Starting EK1100 demo...
[2023-07-11T02:43:25Z INFO ek1100] Ensure an EK1100 is the first slave, with any number of modules connected after
[2023-07-11T02:43:25Z INFO ek1100] Run with RUST_LOG=ethercrab=debug or =trace for debug information
[2023-07-11T02:43:25Z DEBUG ethercrab::slave] Slave 0x1000 name EK1100
[2023-07-11T02:43:25Z DEBUG ethercrab::slave] Slave 0x1001 name EL1014
[2023-07-11T02:43:25Z DEBUG ethercrab::client] Configuring topology/distributed clocks
thread 'main' panicked at 'Init: WorkingCounter { expected: 2, received: 1, context: Some("Broadcast time") }', examples/ek1100.rs:83:10
stacktrace:
stack backtrace:
0: 0x558ab686f23a - std::backtrace_rs::backtrace::libunwind::trace::h9a6b80bbf328ba5d
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x558ab686f23a - std::backtrace_rs::backtrace::trace_unsynchronized::hd162ec543a11886b
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x558ab686f23a - std::sys_common::backtrace::_print_fmt::h78a5099be12f51a6
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:65:5
3: 0x558ab686f23a - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::ha1c5390454d74f71
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:44:22
4: 0x558ab688b8af - core::fmt::write::h9ffde816c577717b
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/fmt/mod.rs:1254:17
5: 0x558ab686c9d5 - std::io::Write::write_fmt::h88186074961638e4
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/io/mod.rs:1698:15
6: 0x558ab686f005 - std::sys_common::backtrace::_print::h184198273ed08d59
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:47:5
7: 0x558ab686f005 - std::sys_common::backtrace::print::h1b4d8e7add699453
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:34:9
8: 0x558ab68706ae - std::panicking::default_hook::{{closure}}::h393bcea75423915a
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:269:22
9: 0x558ab6870455 - std::panicking::default_hook::h48c64f31d8b3fd03
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:288:9
10: 0x558ab6870c0e - std::panicking::rust_panic_with_hook::hafdc493a79370062
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:691:13
11: 0x558ab6870b09 - std::panicking::begin_panic_handler::{{closure}}::h0a64bc82e36bedc7
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:582:13
12: 0x558ab686f6a6 - std::sys_common::backtrace::__rust_end_short_backtrace::hc203444fb7416a16
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:150:18
13: 0x558ab6870862 - rust_begin_unwind
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
14: 0x558ab66ba193 - core::panicking::panic_fmt::h0f6ef0178afce4f2
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
15: 0x558ab66ba5e3 - core::result::unwrap_failed::h8090202169109f9c
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/result.rs:1687:5
16: 0x558ab66c9798 - tokio::runtime::park::CachedParkThread::block_on::he68682098847e7b5
17: 0x558ab66c9c36 - tokio::runtime::context::runtime::enter_runtime::h57a07d4b664d4d90
18: 0x558ab672c101 - tokio::runtime::runtime::Runtime::block_on::h29109ff0d856e13d
19: 0x558ab66bf78f - ek1100::main::h1550c917b3a86bf5
20: 0x558ab672d756 - std::sys_common::backtrace::__rust_begin_short_backtrace::h15abfc1a7efe2366
21: 0x558ab66c1511 - std::rt::lang_start::{{closure}}::h5e04481cc6f084b8
22: 0x558ab68686fe - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hb1327dc2ef3fecdf
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ops/function.rs:287:13
23: 0x558ab68686fe - std::panicking::try::do_call::h4044173225fe83dd
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
24: 0x558ab68686fe - std::panicking::try::hd8a722c09d156a53
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
25: 0x558ab68686fe - std::panic::catch_unwind::hd2ca07971cf0119b
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
26: 0x558ab68686fe - std::rt::lang_start_internal::{{closure}}::h26d89d595cf47b70
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:48
27: 0x558ab68686fe - std::panicking::try::do_call::hf47aa1aa005e5f1a
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
28: 0x558ab68686fe - std::panicking::try::h73d246b2423eaf4e
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
29: 0x558ab68686fe - std::panic::catch_unwind::hbaaeae8f1b2f9915
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
30: 0x558ab68686fe - std::rt::lang_start_internal::h76f3e81e6b8f13f9
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:20
31: 0x558ab66bf865 - main
32: 0x7f03ac6f6d90 - <unknown>
33: 0x7f03ac6f6e40 - __libc_start_main
34: 0x558ab66ba775 - _start
35: 0x0 - <unknown>
Way more exciting when using a EL5151. No panic at all. The example reveals information about the slaves but since the EL5151 does not have any output nothing happens in the loop.
Info output:
[2023-07-11T02:55:18Z INFO ek1100] Starting EK1100 demo...
[2023-07-11T02:55:18Z INFO ek1100] Ensure an EK1100 is the first slave, with any number of modules connected after
[2023-07-11T02:55:18Z INFO ek1100] Run with RUST_LOG=ethercrab=debug or =trace for debug information
[2023-07-11T02:55:21Z INFO ek1100] Discovered 2 slaves
[2023-07-11T02:55:21Z INFO ek1100] -> Slave 0x1000 EK1100 has 0 input bytes, 0 output bytes
[2023-07-11T02:55:21Z INFO ek1100] -> Slave 0x1001 EL5151-0021 has 18 input bytes, 14 output bytes
Looking at the debug output there is a great deal of information about the discovered Sync managers and PDOs, which is great. Full output including debug:
Hey, thanks for the detailed issue! Much appreciated. I've got an EK1100 and EL2004 on my test rig and running the ek1100
example in release mode I can't reproduce your working counter panic I'm afraid. Can you rerun the EK1100 + EL2004 setup but get a Wireshark capture and post it here? Please run in debug mode for the capture, i.e. no --release
.
Just checking - you're using latest master
in your fork, right?
P.s. if you have the just
task runner installed (cargo install just
) you can run the examples with RUST_LOG=debug just linux-example-release ek1100
(release) or RUST_LOG=debug just linux-example ek1100
(debug) - I've set those commands up to use setcap
so you don't have to go searching for binaries in target/
.
Hi again @jamwaffles, I follow your advice and capture the packets while running the example.
I capture the packets in the ethercat interface while running RUST_LOG=debug just linux-example-release ek1100 ethercat0
in the next situations:
The logs and packets recorded are attacked can be accesed here: wireshark-ethercrab-EK1100.zip
To run wireshark in a headless machine you need to install wireshark and tshark. In an ubuntu machine that would be done by:
sudo apt install wireshark tshark
after that run:
> sudo tshark -i ethercat0 -F pcapng
I made some progress troubleshooting this issue:
Following the lead the problem happens at client.init<const MAX_SLAVES: usize, G>
when calling dc::configure_dc(self, slaves.as_mut_slices().0).await?
in client.rs.
I took that out of the equation and now and the debug output is describing all the sync managers, FMMUs, FMMUs mappings PDOs, etc. More here: output.log.
You can see the changes that I made in the next gist Hot fix for ethercrab at commit 29f329f.
I will keep working from there.
Thanks for the extra logs. I've got a small change for you to try if you don't mind. In src/dc.rs
, fn latch_dc_times
, look for this block:
// Latch receive times into all ports of all slaves.
client
.bwr(RegisterAddress::DcTimePort0, 0u32)
.await
.expect("Broadcast time")
.wkc(num_slaves as u16, "Broadcast time")?;
and replace it with:
// Latch receive times into all ports of all slaves.
client
.bwr(RegisterAddress::DcTimePort0, 0u32)
.await
.expect("Broadcast time");
(just delete the .wkc(...)
line)
This ignores the working counter and lets your program proceed a bit further to the point it prints some better debug info.
Please post the log of RUST_LOG=debug ...
and another wireshark trace here. Please don't run it with --release
as the test harness I wrote to replay Wireshark captures doesn't work well with release builds, making it more difficult for me to reproduce locally..
Hey @jubeormk1 this issue was closed by #69 (nice) but I can't confirm if the fix works all the way through. Please give latest master a test and reopen here if it still doesn't work :).
Hi again!
I have been working in other things and did not test this and did not realised that I was working on my "patched" version. I have more feedback for my particular system:
after pulling the latest version of the master branch the error has changed:
$ ~/ethercrab-rs/ethercrab$ just linux-example ek1100 enp5s0f2
cargo build --example ek1100 && sudo setcap cap_net_raw=pe ./target/debug/examples/ek1100 && ./target/debug/examples/ek1100 enp5s0f2
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
[2023-07-20T23:37:02Z INFO ek1100] Starting EK1100 demo...
[2023-07-20T23:37:02Z INFO ek1100] Ensure an EK1100 is the first slave, with any number of modules connected after
[2023-07-20T23:37:02Z INFO ek1100] Run with RUST_LOG=ethercrab=debug or =trace for debug information
thread 'main' panicked at 'Init: WorkingCounter { expected: 1, received: 0, context: Some("Port receive times") }', examples/ek1100.rs:84:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: Recipe `linux-example` failed on line 4 with exit code 101
The string "Port receive times" is only found in src/dc.rs:latch_dc_times:
let [time_p0, time_p1, time_p2, time_p3] = sl
.read::<[u32; 4]>(RegisterAddress::DcTimePort0, "Port receive times")
.await?;
Adding some silly logging in latch_dc_times inside the for slave in slaves loop
the first slave to fail is IL1004. the iterator is going over slaves that support and do not support dc:
[2023-07-21T00:02:21Z INFO ek1100] Starting EK1100 demo...
[2023-07-21T00:02:21Z INFO ek1100] Ensure an EK1100 is the first slave, with any number of modules connected after
[2023-07-21T00:02:21Z INFO ek1100] Run with RUST_LOG=ethercrab=debug or =trace for debug information
[2023-07-21T00:02:22Z INFO ethercrab::dc] Trying slave (address: "EK1100")
[2023-07-21T00:02:22Z INFO ethercrab::dc] Trying slave (address: "EL5151-0021")
[2023-07-21T00:02:22Z INFO ethercrab::dc] Trying slave (address: "EL1014")
thread 'main' panicked at 'Init: WorkingCounter { expected: 1, received: 0, context: Some("Port receive times") }', examples/ek1100.rs:84:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Here is my workaround that iterates over the slaves that support dc borrowing your filter for num_slaves_with_dc
.
Second attempt from #73 seems to make this work now. Please reopen (again lol) if this breaks (again).
I can confirm that it works for EK1100 + EL5151 +EL1004 + 5x EL2004. Thanks!
I just started learning how to use this library by running the ek1100.rs example.
I have a simple topology: Ubuntu 22.04 Server minimized and:
Cloning this repo to have stable support as commented in the matrix channel (very quick response! Thanks for that!) and running the mentioned example I get the next error:
I build the example with
cargo build --example ek1100 --release
.After that run it by
sudo RUST_BACKTRACE=full RUST_LOG=debug ./target/release/examples/ek1100 eth0
where eth0 is the name of the NIC where the EtherCAT equipment is plugged.For some more context this is the full stack trace:
I am also testing the c wrapper ethercat-rs so I had to install the IgH modules but all of that is disabled and stopped. I tried the example in interfaces previously configured as ec_master and others unassigned to ethercat with the same results.
I will post more information with more testing.