Open DJtime opened 9 months ago
Thanks a lot for your work. This looks really interesting and it would be a nice addition to INET. We had a bit of time to briefly look into the PR and the code generally looks OK but it would need some more refining before merge. Are you willing to invest some more time? If so, we can continue discussing the details.
Yes, i can invest some more time. Although my master thesis and study will be finished, i am willing to support this topic until merge is completed.
This is great, thank you! Rudolf and I looked at the model together yesterday, and agreed that the thing we missed most was high-level documentation. Information like: what part of IEC 62439-2 does the model implement, and, which parts of spec are NOT included in the model? How do concepts in the specification represented in the simulation model? E.g. how switch roles like Media Redundancy Master (MRM), Media Redundancy Client (MRC), Media Redundancy Checker (MRCk), Auto Client (AC), Interconnection (IC) manifest themselves in the model? How to set up single-ring and multi-ring networks?
As a side note, can you give us a hint (maybe via email) where to get detailed info about the IEC 62439-2 protocol? Buying the spec isn't exactly an option, and apart from that, so far we've only found a couple of web pages with various bits of overview-level information.
Thanks for the docs, we are currently reading them, and familiarizing ourselves with the model.
You asked: "I can write some more comments regarding roles and set-up. Where would be the appropriate place, header file or ned-file?" It should go into the NED file, right above the modules definitions. Every module should have a few lines of docu that answers questions like:
The NED docu should be useful for a potential user of the model. By reading it, the person should be able to decide whether this is the module s/he is looking for, whether it implements all features s/he needs for the simulation, etc.
In any case, the NED docu should not go into C++ details, because it's irrelevant on that level. If you catch yourself writing down the names of C++ methods, classes, enum values etc, that's the wrong track, please stop :)
You can find examples browsing existing doc: https://doc.omnetpp.org/inet/api-current/neddoc/index.html
Thank you for adding the comments, great work! It is also extremely useful that you added detailed comments for parameters, too. Much appreciated! Documentation was the most important thing we wanted to get right, because writing it without your help would be quite hard for us to do.
Tests
tests/networks/mrp
is great but it currently does not really function as a test, because it lacks automated checking of outcome (there's no PASS / FAIL output). We suggest placing a copy of it under examples/
as a dedicated example simulation; meanwhile we are pondering how to best turn the copy under tests/
into a real test.Some smaller observations:
NED Parameters:
int expectedRoleByNum
-- should be type string
with @enum
; BTW why is it called "expected"?
Not all parameters are read by the code: processingDelayMean
, processingDelayDev
, linkDetectionDelayMean
, linkDetectionDelayDev
, possibly others (to be checked).
linkDetectionDelayMean
, linkDetectionDelayDev
are currently of type object
in NED which does not seem warranted; we recommend replacing them with a single volatile double
parameter: volatile double linkDetectionDelay @unit(s) = default(truncnormal(...));
and their current usage linkDetectionDelay = truncnormal(linkDetectionDelayMean, linkDetectionDelayDev);
replaced with linkDetectionDelay = linkDetectionDelayPar->doubleValue()
where linkDetectionDelayPar
is a cPar*
pointer.
Statistics:
emit(RingStateChangedSignal, simTime().inUnit(SIMTIME_US));
. This is redundant because the signal already carries the timestamp, and is recorded into an output vector together with it. We recommend choosing some other meaningful quantity as signal value, e.g. for RingStateChangedSignal
the current ring state (OPEN
, CLOSED
, etc)Module name:
MediaRedundancyNode
and InterconnectionNode
are a bit confusing now, because "nodes" in INET normally refer to devices like routers or switches, and not to protocol implementations like MRP. Modules that implement protocols are normally simply named after the protocol, e.g. Tcp
or Ipv4
or its subcomponent like Ieee80211Mac
. Maybe rename the module types to Mrp
and MrpInterconnect
or something like that?Tests: I have copied the existing test (after reworking) into examples/mrp. NED Parameters: expectedRolebyNum: Mrp standard calls the parameter internally expectedRole, but I renamed it for clarity to mrpRole. Switching it to string @enum would (as far as I can tell) require an additional parsing function (similiar to parseFcsMode function in linklayer/common/FcsMode.cc), so i left it as INT for now. I reworked all delays and ned parameters and finally found the error which prevented MrpInterconnection NED from inheriting properly from Mrp NED in the first place.
Signals: All signals now carry some other value than SimulationTime
Module Name: MediaRedundancyNode was renamed to Mrp, InterconnectionNode is now MrpInterconnection
Thanks for the update! Please give us some more time to review your changes.
Update: I took over your commits and added them into INET under the branch topic/mrp
link. This looked practical, as it allowed me to make a number of cosmetic changes to the code to better fit INET's coding style without pestering you. This is still work in progress, so expect some more to come. You can probably pull my changes back to your repo if you want to make further changes, but I'd wait a few days. Also please be warned that I sometimes rebase and force-push the topic branch, especially the last few commits.
I also need to play more with the model to get the hang of it. Could we add some WATCHes into the module, and/or display brief status information above the mrp
submodule in the simulation using "t" display string tags? That would help figuring out at runtime what's going on in the model. For the latter, what info would you suggest?
Also, could we extend the descriptions of the Mrp
and MrpInterconnect
modules, i.e. the comment blocks above the modules, with more info on what prominent protocol features the module DOES and DOES NOT implement? It would help to set the expectations of any future user, and also hint to possible future contributors on where the code needs improvement.
Additional question: I found simtime-resolution = us
in omnetpp.ini
-- what is the reason it is there? is it required?
the simtime-resolution comment should not be necessary in theory.... but I have never tested it without it. I will take a look into it and will also given you feedback to watches and additional comments. Currently the standard is fully implemented. Only the Link-check is rudimentary as it is an additional standard (802.1Q)
Question: Mrp::setupLinkChangeReq()
contains the following:
if (linkState == NetworkInterface::UP) {
linkChangeTlv->setHeaderType(LINKUP);
} else if (linkState == NetworkInterface::DOWN) {
linkChangeTlv->setHeaderType(LINKDOWN);
} else ...
...
linkChangeTlv->setBlocked(linkState);
The last line looks a bit suspicious -- so the blocked
field and the header type are both set according to the linkState variable, so the header type always determines the state of the blocked
field. Isn't the blocked
field supposed to be able to carry some independent meaning? I found "8.1.22 Coding of the field MRP_Blocked" but I'm not familiar enough with the protocol to decide.
You are right. I have looked it up. According to Table 36 on page 67 in standard the field should simple be set to "1". Function description can be found on page 105.
Thanks for the info! It additionally led to 2 fixes, where linkState was mistakenly obtained from the MRP_Blocked and MRP_LinkInfo fields of the LinkChange/InLinkChange packets -- I believe the correct way is to look at the header type (whether it is LINKUP or LINKDOWN, etc). commit 44b2563dde2a3ccdf321353743251db708740ebb
I have some questions. The spec talks about LC-mode and RC-mode for MIMs -- are these the same as the linkCheckEnabled and ringCheckEnabled bool variables in the code? Since section "8.2.7 MIM protocol machine" (page 109) defines two different (slightly different?) state machines for LC-mode and RC-mode, is it possible that linkCheckEnabled and ringCheckEnabled cannot be both enabled at the same time? i.e. exactly one of the them should be enabled? I don't see lines in the code to ensure that.
yes, the bool variables linkCheckEnabled and ringCheckEnabled represent the two modes LC and RC. There are 1 or two concrete differences in the state machines but mostly in RC the LC code is extended and each node has to do more than in LC.... as in normal mrp rings it is explicitly possible to enable link check side by side with the ring test, I have assumed that in interconnection it should be possible to have both methods simultaneously available too. But rereading the explanation of both modes on page 20, one could understand it the way that only one mode may be active... It is no explicitly requested that both modes should be excluding each other or working in parallel.
Thank you for the info. I was asking because I was wondering that maybe instead of the two boolean parameters there should be a single interconnectionMode
parameter with the allowed values UNDEFINED
, LC_MODE
, and RC_MODE
, as described in the list on page 42. What do you think?
BTW, kudos for implementing the protocol! The spec is not exactly the easiest read I have encountered. I am slowly making progress with the review, but since I also have other tasks to work on, it's not going fast. Thanks for your patience.
I just discovered that the bool checkMediaRedundancy
parameter of Mrp
has no implementartion. The code just reads it into the class member of the same name, and doesn't use it. Now the question is whether it should be (A) implemented, or (B) removed. The spec says on p41, "Check Media Redundancy: This attribute specifies whether monitoring of MRM state is enabled (TRUE) or disabled (FALSE) in the redundancy domain.", also pn p44 and p50: "Check Media Redundancy:
This parameter selects whether monitoring of MRM state is enabled or disabled."
Do I understand correctly that this is sort of the main switch to turn on/off MRP as a whole? Then I think the parameter can be removed (option B). If someone wants to turn off a MRP in a switch, they can remove the mrp
submodule from the switch via NED or ini configuration.
The Mrp
module has a parameter called noTopologyChange
. Does this correspond to the NO_TC
variable in the spec, described in "Table 40 – MRP Local variables of MRM protocol machine"
on on p70 as with the comment "Suppress MRP_TopologyChange while in line topology"?
If so, then it is supposed to be an internal variable of Mrp
, not a configurable parameter. A search for NO_TC
in the spec reveals that various functions set it to TRUE
or FALSE
under various conditions, so it cannot be a configurable parameter.
Sorry for my longer period of silence. But I needed some space from MRP for a while after one year of deep dive.
"I was wondering that maybe instead of the two boolean parameters there should be a single interconnectionMode parameter with the allowed values UNDEFINED, LC_MODE, and RC_MODE, as described in the list on page 42. What do you think?" --> As it does not very much sense to have both enabled, it could help to prevent accidental misconfiguration
checkMediaRedundancy: I have always interpreted it as a way to enable the active Testing of the MRP ring.... So if disabled, only local link detection is done. But this one of the many points where MRP is not very specific.... I have not found any point in spec where this variable is referenced.
noTopologyChange: Yes it corresponds to the internal variable and therefore the parameter should be removed from NED file
I am currently reviewing the changes you already have done and will check the model further.
No worries, I really appreciate your continued support! My apologies for constantly rebasing/editing the last few commits of the branch, I am doing it in order to create a cleaner change history where my mistakes are easier to spot.
My commits are mostly stylistic changes, they do not change the operation of the code. I took care that the fingerprints are unchanged throughout all my commits (only LargeNet model fingerprints change at one point, because I changed the network.)
Adding protocol MRP (IEC 62439-2). Media Redundancy Protocol acts similar to RSTP, but provides a upper time limit needed for recovery in case of link failure.