DUNE-DAQ / iomanager

Package providing a unified API

Could ConnectionInstanceNotFound in NetworkReceiverModel be reported as an ERS error? #48

Open bieryAtFnal opened 1 year ago

bieryAtFnal commented 1 year ago

I noticed that the error messages associated with this line were not being picked up by the log file checking in our integtests.

The reason was that the severity of the message is "LOG" based on the use of TLOG().

Could this line be changed to report an ers::error?

eflumerf commented 1 year ago

Interesting...I recently changed that from an error to a log since it was occurring in places where there wasn't actually a problem...

(dbt) [eflumerf@ironvirt7 network]$ git show 19576e85
commit 19576e8565c9a9cbf78418112c2892dc224d0aa4
Author: Eric Flumerfelt <eflumerf@fnal.gov>
Date:   Thu Feb 9 08:12:08 2023 -0600

    Change error message to log in try_receive

diff --git a/include/iomanager/network/NetworkReceiverModel.hpp b/include/iomanager/network/NetworkReceiverModel.hpp
index 359534a..584818d 100644
--- a/include/iomanager/network/NetworkReceiverModel.hpp
+++ b/include/iomanager/network/NetworkReceiverModel.hpp
@@ -149,7 +149,7 @@ private:
     std::lock_guard<std::mutex> lk(m_receive_mutex);
     get_receiver(timeout);
     if (m_network_receiver_ptr == nullptr) {
-      ers::error(ConnectionInstanceNotFound(ERS_HERE, this->id().uid));
+      TLOG() << ConnectionInstanceNotFound(ERS_HERE, this->id().uid);
       return std::nullopt;
     }

eflumerf commented 1 year ago

At a guess, I would say that it should be a log for "connect"-type endpoints (Senders and Publishers), and an error for "bind"-type...which means additional logic should be added somewhere to distinguish those cases in NetworkReceiverModel and NetworkSenderModel (Line 137 is currently a TLOG as well)...
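
A minimal sketch of the kind of distinction described above, not the actual iomanager code. Everything here is a hypothetical stand-in: `EndpointKind`, `Severity`, `Report`, `severity_for_missing_instance` and `try_read` do not exist in iomanager or ERS, and a real change would call `ers::error(...)` or `TLOG()` instead of appending to a vector.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names come from iomanager or ERS.
enum class EndpointKind { Bind, Connect };
enum class Severity { Log, Error };

struct Report {
  Severity severity;
  std::string text;
};

// Assumed policy from the comment above: a missing instance is an error for
// "bind"-type endpoints and only a log message for "connect"-type endpoints.
constexpr Severity severity_for_missing_instance(EndpointKind kind) {
  return kind == EndpointKind::Bind ? Severity::Error : Severity::Log;
}

// Mirrors the shape of the nullptr check in the diff: report, then return
// std::nullopt. The sink stands in for the real ers::error / TLOG() calls.
std::optional<int> try_read(const int* receiver,
                            EndpointKind kind,
                            const std::string& uid,
                            std::vector<Report>& sink) {
  if (receiver == nullptr) {
    sink.push_back({severity_for_missing_instance(kind),
                    "Connection Instance not found for name " + uid});
    return std::nullopt;
  }
  return *receiver;
}

// Usage: the same failure is reported at different severities per kind.
inline void example_missing_instance() {
  std::vector<Report> sink;
  assert(!try_read(nullptr, EndpointKind::Connect, "conn_a", sink).has_value());
  assert(sink.back().severity == Severity::Log);
  assert(!try_read(nullptr, EndpointKind::Bind, "conn_a", sink).has_value());
  assert(sink.back().severity == Severity::Error);
}
```

Keeping the policy in one small function would let NetworkReceiverModel and NetworkSenderModel share it, assuming each model can tell which kind of endpoint it is.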

bieryAtFnal commented 1 year ago

In the particular scenario that I saw, the problem occurred when I tried to create a receiver with an empty string for the connection ID. Of course, it's true that this was a bug in my code, but it would have been nice for the error to be caught by the integrationtest log checking.

Here is the TLOG output, in case it is of any use...

log_dqmrulocalhost0_4337.txt:2023-Mar-01 15:12:29,364 LOG [typename std::enable_if<dunedaq::serialization::is_serializable<MessageType>::value, std::optional<_Up> >::type dunedaq::iomanager::NetworkReceiverModel<Datatype>::try_read_network(const dunedaq::iomanager::Receiver::timeout_t&) [with MessageType = dunedaq::dfmessages::TRMonRequest; Datatype = dunedaq::dfmessages::TRMonRequest; typename std::enable_if<dunedaq::serialization::is_serializable<MessageType>::value, std::optional<_Up> >::type = std::optional<dunedaq::dfmessages::TRMonRequest>; dunedaq::iomanager::Receiver::timeout_t = std::chrono::duration<long int, std::ratio<1, 1000> >] at /cvmfs/dunedaq-development.opensciencegrid.org/nightly/N23-02-28/spack-0.18.1-gcc-12.1.0/spack-0.18.1/opt/spack/gcc-12.1.0/iomanager-N23-02-28-eqhipmafjmtaoxfrpm3v2uw5mzf3urql/include/iomanager/network/NetworkReceiverModel.hpp:152] 2023-Mar-01 15:12:29,364 ERROR [typename std::enable_if<dunedaq::serialization::is_serializable<MessageType>::value, std::optional<_Up> >::type dunedaq::iomanager::NetworkReceiverModel<Datatype>::try_read_network(const dunedaq::iomanager::Receiver::timeout_t&) [with MessageType = dunedaq::dfmessages::TRMonRequest; Datatype = dunedaq::dfmessages::TRMonRequest; typename std::enable_if<dunedaq::serialization::is_serializable<MessageType>::value, std::optional<_Up> >::type = std::optional<dunedaq::dfmessages::TRMonRequest>; dunedaq::iomanager::Receiver::timeout_t = std::chrono::duration<long int, std::ratio<1, 1000> >] at /cvmfs/dunedaq-development.opensciencegrid.org/nightly/N23-02-28/spack-0.18.1-gcc-12.1.0/spack-0.18.1/opt/spack/gcc-12.1.0/iomanager-N23-02-28-eqhipmafjmtaoxfrpm3v2uw5mzf3urql/include/iomanager/network/NetworkReceiverModel.hpp:152] Connection Instance not found for name
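
To illustrate why a line like the one above can slip past a severity-based check, here is a small sketch. It assumes the checker keys on the first severity token after the timestamp; that is a guess about how the integtest log checking behaves, not a description of it, and `outer_severity` and `flagged_by_severity_check` are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the third whitespace-separated token of a log line. In the line
// format shown above that is the severity of the outer record
// ("<date> <time> <SEVERITY> [...]"). Returns an empty string if the line
// has fewer than three tokens.
std::string outer_severity(const std::string& line) {
  std::istringstream in(line);
  std::string date, time, severity;
  in >> date >> time >> severity;
  return severity;
}

// Assumed check: only lines whose outer severity is ERROR or FATAL count.
bool flagged_by_severity_check(const std::string& line) {
  const std::string severity = outer_severity(line);
  return severity == "ERROR" || severity == "FATAL";
}

// Usage: an ERROR-severity issue streamed through a LOG-severity record is
// not flagged, because the outer severity is what the check sees.
inline void example_log_check() {
  const std::string wrapped =
    "2023-Mar-01 15:12:29,364 LOG [f at x.hpp:152] "
    "2023-Mar-01 15:12:29,364 ERROR [f at x.hpp:152] "
    "Connection Instance not found for name";
  assert(outer_severity(wrapped) == "LOG");
  assert(!flagged_by_severity_check(wrapped));
}
```

Under that assumption, the inner "ERROR" text is just part of the message body, which matches the behaviour reported at the top of this issue.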