inet-framework / inet

INET Framework for the OMNeT++ discrete event simulator
https://inet.omnetpp.org
Other
442 stars 488 forks source link

Adding protocol MRP (IEC 62439-2) #947

Open DJtime opened 9 months ago

DJtime commented 9 months ago

Adding protocol MRP (IEC 62439-2). Media Redundancy Protocol acts similar to RSTP, but provides a upper time limit needed for recovery in case of link failure.

rhornig commented 9 months ago

Thanks a lot for your work. This looks really interesting and it would be a nice addition to INET. We had a bit of time to briefly look into the PR and the code generally looks OK but it would need some more refining before merge. Are you willing to invest some more time? If so, we can continue discussing the details.

DJtime commented 9 months ago

Yes, i can invest some more time. Although my master thesis and study will be finished, i am willing to support this topic until merge is completed.

avarga commented 9 months ago

This is great, thank you! Rudolf and I looked at the model together yesterday, and agreed that the thing we missed most was high-level documentation. Information like: what part of IEC 62439-2 does the model implement, and, which parts of spec are NOT included in the model? How do concepts in the specification represented in the simulation model? E.g. how switch roles like Media Redundancy Master (MRM), Media Redundancy Client (MRC), Media Redundancy Checker (MRCk), Auto Client (AC), Interconnection (IC) manifest themselves in the model? How to set up single-ring and multi-ring networks?

As a side note, can you give us a hint (maybe via email) where to get detailed info about the IEC 62439-2 protocol? Buying the spec isn't exactly an option, and apart from that, so far we've only found a couple of web pages with various bits of overview-level information.

avarga commented 9 months ago

Thanks for the docs, we are currently reading them, and familiarizing ourselves with the model.

You asked: "I can write some more comments regarding roles and set-up. Where would be the appropriate place, header file or ned-file?" It should go into the NED file, right above the modules definitions. Every module should have a few lines of docu that answers questions like:

The NED docu should be useful for a potential user of the model. By reading it, the person should be able to decide whether this is the module s/he is looking for, whether it implements all features s/he needs for the simulation, etc.

In any case, the NED docu should not go into C++ details, because it's irrelevant on that level. If you catch yourself writing down the names of C++ methods, classes, enum values etc, that's the wrong track, please stop :)

You can find examples browsing existing doc: https://doc.omnetpp.org/inet/api-current/neddoc/index.html

avarga commented 8 months ago

Thank you for adding the comments, great work! It is also extremely useful that you added detailed comments for parameters, too. Much appreciated! Documentation was the most important thing we wanted to get right, because writing it without your help would be quite hard for us to do.

Tests

Some smaller observations:

NED Parameters:

Statistics:

Module name:

DJtime commented 8 months ago

Tests: I have copied the existing test (after reworking) into examples/mrp. NED Parameters: expectedRolebyNum: Mrp standard calls the parameter internally expectedRole, but I renamed it for clarity to mrpRole. Switching it to string @enum would (as far as I can tell) require an additional parsing function (similiar to parseFcsMode function in linklayer/common/FcsMode.cc), so i left it as INT for now. I reworked all delays and ned parameters and finally found the error which prevented MrpInterconnection NED from inheriting properly from Mrp NED in the first place.

Signals: All signals now carry some other value than SimulationTime

Module Name: MediaRedundancyNode was renamed to Mrp, InterconnectionNode is now MrpInterconnection

avarga commented 8 months ago

Thanks for the update! Please give us some more time to review your changes.

avarga commented 8 months ago

Update: I took over your commits and added them into INET under the branch topic/mrp link. This looked practical, as it allowed me to make a number of cosmetic changes to the code to better fit INET's coding style without pestering you. This is still work in progress, so expect some more to come. You can probably pull my changes back to your repo if you want to make further changes, but I'd wait a few days. Also please be warned that I sometimes rebase and force-push the topic branch, especially the last few commits.

I also need to play more with the model to get the hang of it. Could we add some WATCHes into the module, and/or display brief status information above the mrp submodule in the simulation using "t" display string tags? That would help figuring out at runtime what's going on in the model. For the latter, what info would you suggest?

Also, could we extend the descriptions of the Mrp and MrpInterconnect modules, i.e. the comment blocks above the modules, with more info on what prominent protocol features the module DOES and DOES NOT implement? It would help to set the expectations of any future user, and also hint to possible future contributors on where the code needs improvement.

avarga commented 8 months ago

Additional question: I found simtime-resolution = us in omnetpp.ini -- what is the reason it is there? is it required?

DJtime commented 8 months ago

the simtime-resolution comment should not be necessary in theory.... but I have never tested it without it. I will take a look into it and will also given you feedback to watches and additional comments. Currently the standard is fully implemented. Only the Link-check is rudimentary as it is an additional standard (802.1Q)

avarga commented 8 months ago

Question: Mrp::setupLinkChangeReq() contains the following:

    if (linkState == NetworkInterface::UP) {
        linkChangeTlv->setHeaderType(LINKUP);
    } else if (linkState == NetworkInterface::DOWN) {
        linkChangeTlv->setHeaderType(LINKDOWN);
    } else ...
    ...
    linkChangeTlv->setBlocked(linkState);

The last line looks a bit suspicious -- so the blocked field and the header type are both set according to the linkState variable, so the header type always determines the state of the blocked field. Isn't the blocked field supposed to be able to carry some independent meaning? I found "8.1.22 Coding of the field MRP_Blocked" but I'm not familiar enough with the protocol to decide.

DJtime commented 8 months ago

You are right. I have looked it up. According to Table 36 on page 67 in standard the field should simple be set to "1". Function description can be found on page 105.

avarga commented 8 months ago

Thanks for the info! It additionally led to 2 fixes, where linkState was mistakenly obtained from the MRP_Blocked and MRP_LinkInfo fields of the LinkChange/InLinkChange packets -- I believe the correct way is to look at the header type (whether it is LINKUP or LINKDOWN, etc). commit 44b2563dde2a3ccdf321353743251db708740ebb

avarga commented 7 months ago

I have some questions. The spec talks about LC-mode and RC-mode for MIMs -- are these the same as the linkCheckEnabled and ringCheckEnabled bool variables in the code? Since section "8.2.7 MIM protocol machine" (page 109) defines two different (slightly different?) state machines for LC-mode and RC-mode, is it possible that linkCheckEnabled and ringCheckEnabled cannot be both enabled at the same time? i.e. exactly one of the them should be enabled? I don't see lines in the code to ensure that.

DJtime commented 7 months ago

yes, the bool variables linkCheckEnabled and ringCheckEnabled represent the two modes LC and RC. There are 1 or two concrete differences in the state machines but mostly in RC the LC code is extended and each node has to do more than in LC.... as in normal mrp rings it is explicitly possible to enable link check side by side with the ring test, I have assumed that in interconnection it should be possible to have both methods simultaneously available too. But rereading the explanation of both modes on page 20, one could understand it the way that only one mode may be active... It is no explicitly requested that both modes should be excluding each other or working in parallel.

avarga commented 7 months ago

Thank you for the info. I was asking because I was wondering that maybe instead of the two boolean parameters there should be a single interconnectionMode parameter with the allowed values UNDEFINED, LC_MODE, and RC_MODE, as described in the list on page 42. What do you think?

BTW, kudos for implementing the protocol! The spec is not exactly the easiest read I have encountered. I am slowly making progress with the review, but since I also have other tasks to work on, it's not going fast. Thanks for your patience.

avarga commented 7 months ago

I just discovered that the bool checkMediaRedundancy parameter of Mrp has no implementartion. The code just reads it into the class member of the same name, and doesn't use it. Now the question is whether it should be (A) implemented, or (B) removed. The spec says on p41, "Check Media Redundancy: This attribute specifies whether monitoring of MRM state is enabled (TRUE) or disabled (FALSE) in the redundancy domain.", also pn p44 and p50: "Check Media Redundancy: This parameter selects whether monitoring of MRM state is enabled or disabled."

Do I understand correctly that this is sort of the main switch to turn on/off MRP as a whole? Then I think the parameter can be removed (option B). If someone wants to turn off a MRP in a switch, they can remove the mrp submodule from the switch via NED or ini configuration.

avarga commented 7 months ago

The Mrp module has a parameter called noTopologyChange. Does this correspond to the NO_TC variable in the spec, described in "Table 40 – MRP Local variables of MRM protocol machine" on on p70 as with the comment "Suppress MRP_TopologyChange while in line topology"?

If so, then it is supposed to be an internal variable of Mrp, not a configurable parameter. A search for NO_TC in the spec reveals that various functions set it to TRUE or FALSE under various conditions, so it cannot be a configurable parameter.

DJtime commented 7 months ago

Sorry for my longer period of silence. But I needed some space from MRP for a while after one year of deep dive.

"I was wondering that maybe instead of the two boolean parameters there should be a single interconnectionMode parameter with the allowed values UNDEFINED, LC_MODE, and RC_MODE, as described in the list on page 42. What do you think?" --> As it does not very much sense to have both enabled, it could help to prevent accidental misconfiguration

checkMediaRedundancy: I have always interpreted it as a way to enable the active Testing of the MRP ring.... So if disabled, only local link detection is done. But this one of the many points where MRP is not very specific.... I have not found any point in spec where this variable is referenced.

noTopologyChange: Yes it corresponds to the internal variable and therefore the parameter should be removed from NED file

I am currently reviewing the changes you already have done and will check the model further.

avarga commented 7 months ago

No worries, I really appreciate your continued support! My apologies for constantly rebasing/editing the last few commits of the branch, I am doing it in order to create a cleaner change history where my mistakes are easier to spot.

avarga commented 7 months ago

My commits are mostly stylistic changes, they do not change the operation of the code. I took care that the fingerprints are unchanged throughout all my commits (only LargeNet model fingerprints change at one point, because I changed the network.)