openETCS / toolchain

WP7: Top Level Project for the toolchain

Feedback on requirement metadata attribution #479

Closed morido closed 9 years ago

morido commented 9 years ago

Hello everyone,

for my subset026 requirement importer I recently added a few (currently very simple) algorithms to detect different attributes of requirements. Namely:

Kind (Enumeration)

  1. Detect if the requirement is a Heading (testHeading())
  2. Detect if the requirement is a Note (testNote())
  3. Detect if the requirement is deleted (testPlaceholder())
  4. Detect if the requirement is a Justification (testJustification())
  5. Detect if the requirement is a Definition (testDefinition())

Note: Requirements marked as Exceptions currently do not receive any special treatment. Tell me if you need that as well.

Atomicity (Boolean)

  1. Detect if the requirement may be regarded as atomic (testAtomicity())

Legal Obligation (Enumeration)

  1. Detect the different obligations (shall/may/...) a requirement may have (testLegalObligation())

Vagueness (List of Words)

  1. Detect if the requirement text contains weak words like "temporarily", "immediately", "at least", "some", ... (testVagueness())

Each of those groups is (mostly) independent (i.e. a requirement may be of type Definition, not be atomic and have a mandatory legal obligation -- this may sound stupid at first, but see for yourself)


What I would like you to do

Please review my test file and tell me if you do not agree with any of the test strings and their respective metadata attribution. If there is no feedback I assume everything is fine.

I hope everyone is able to read jUnit test files. If not, give me a shout. Essentially all you need to look for are the instances of TestDatum() in step 1 of each test; then check whether those strings are a good fit for the respective attribute (hint: sometimes there are counterexamples, too -- those have matchExpected unset).

BTW: Feel free to send me other examples of requirements that fit into any of the aforementioned categories. More training data is always welcome.

MatthieuPERIN commented 9 years ago

Hi, if your code is Java related, maybe the form isHeading() / isNote() is more suitable?

For the legal obligation, if "may" and "must" both appear in the requirement, how does your code react?

morido commented 9 years ago

Hi, if your code is Java related, maybe the form isHeading() / isNote() is more suitable?

What exactly are you referring to? Method names? The linked gist only contains jUnit test code. Its methods all obey the same naming convention which simply states that each method name must start with test. Prepending is instead would imply a boolean return value. However, jUnit tests never return anything.

For the legal obligation, if "may" and "must" both appear in the requirement, how does your code react?

This is unlikely to happen since must is not a legal keyword in our specs. I suppose you actually mean shall. See line 292. It would attribute a MIXED legal obligation.
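The keyword rule described in this reply can be sketched roughly as follows. This is a minimal illustration and not the importer's actual code: the class name, the method name and the values MANDATORY and OPTIONAL are assumptions; only MIXED and UNKNOWN are named in this thread.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class LegalObligationSketch {

    // MANDATORY and OPTIONAL are hypothetical names; MIXED and UNKNOWN appear in the thread.
    enum LegalObligation { MANDATORY, OPTIONAL, MIXED, UNKNOWN }

    // Whole-word matches only, so "shallow" or "maybe" are not counted.
    private static final Pattern SHALL = Pattern.compile("\\bshall\\b");
    private static final Pattern MAY = Pattern.compile("\\bmay\\b");

    static LegalObligation classify(String requirementText) {
        String text = requirementText.toLowerCase(Locale.ROOT);
        boolean hasShall = SHALL.matcher(text).find();
        boolean hasMay = MAY.matcher(text).find();
        if (hasShall && hasMay) {
            return LegalObligation.MIXED;
        }
        if (hasShall) {
            return LegalObligation.MANDATORY;
        }
        if (hasMay) {
            return LegalObligation.OPTIONAL;
        }
        // "must" is not a legal keyword in the specs, so it does not change the result.
        return LegalObligation.UNKNOWN;
    }
}
```

With this sketch, a sentence containing only "must" falls through to UNKNOWN, which matches the behaviour described for that case.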

MatthieuPERIN commented 9 years ago

Sorry, I had not understood that these were JUnit tests! Your naming with the test prefix is obviously right!

I think the MIXED attribute is a good idea, thanks for the answer !

morido commented 9 years ago

Ok, sorry if I was not explicit enough.

So once again: All you see is jUnit tests (hence the @Test annotation everywhere). I do not want to bother you with implementation details. All I need is a second opinion on my test data. I.e. do you agree with my metadata attribution or not.

MERCEmentre commented 9 years ago

Some comments:

morido commented 9 years ago

You consider "moved" as "deleted"?

Yes. Effectively the deleted-attribute turns the respective tracestring (which technically is a pointer to a specific position and not to a specific text in the document) into a "dead end". And for that I consider it unimportant why this happened (the requirement is deleted, has moved, never existed in the first place, ...). So eventually one could write a checker somewhere downstream in the toolchain which iterates over all referenced requirements and emits an error if one of them is deleted. Thus, it would force the implementer to reference the requirement text at its new position instead (wherever this may be - the specs do not say. If they did, I could build a much smarter algorithm...).

Does that make sense to you?
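A downstream checker of the kind described here could look roughly like the sketch below. It assumes the toolchain can supply the list of tracestrings referenced by an implementation and the set of tracestrings flagged as deleted; the class name, method name and both parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class DeadEndCheckSketch {

    /**
     * Returns one error message per referenced tracestring that is flagged as deleted.
     * Both inputs are assumed to come from elsewhere in the toolchain.
     */
    static List<String> findDeadEnds(List<String> referencedTracestrings,
                                     Set<String> deletedTracestrings) {
        List<String> errors = new ArrayList<>();
        for (String tracestring : referencedTracestrings) {
            if (deletedTracestrings.contains(tracestring)) {
                // The reason (deleted, moved, never existed) is deliberately not distinguished.
                errors.add("Reference to deleted requirement: " + tracestring);
            }
        }
        return errors;
    }
}
```

An empty result would mean that no reference points at a dead end; any entry tells the implementer which reference has to be moved to the new position by hand.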

MERCEmentre commented 9 years ago

Yes, it makes sense.

Other remark:

Otherwise, no specific remark. The Vagueness tests are fun to read... and frightening in retrospect!

morido commented 9 years ago

Would you consider it doable to add an EXTERNAL obligation, e.g. for https://gist.github.com/morido/fe9e08252256e7137382#file-metadatadeterminertest-java-L294 ?

It should not be too difficult to implement that. For what do you need it and when should it trigger?

But be aware: This would not affect the legal obligation in the above case, which would still remain "UNKNOWN" simply because the sentence neither contains "shall" nor "may" ("must" is not a legal keyword in our specs).

Why do you consider this case vague? https://gist.github.com/morido/fe9e08252256e7137382#file-metadatadeterminertest-java-L341

Because of "defined time" which is never actually defined... There will be a field in the ReqIF-output which highlights those keywords in the respective sentences. So it should become more obvious.
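The idea of reporting which keywords made a sentence look vague can be illustrated with a small sketch. The word list below only contains the phrases quoted in this thread; treating "defined time" as a list entry, and the class and method names, are assumptions rather than the importer's actual design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

final class VaguenessSketch {

    // Phrases quoted in this thread; a real list would be considerably longer.
    private static final List<String> WEAK_WORDS =
            Arrays.asList("temporarily", "immediately", "at least", "some", "defined time");

    /** Returns the weak words found in the text, in list order, for later highlighting. */
    static List<String> findWeakWords(String requirementText) {
        String text = requirementText.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String weakWord : WEAK_WORDS) {
            // Word boundaries keep "some" from matching inside "something".
            Pattern pattern = Pattern.compile("\\b" + Pattern.quote(weakWord) + "\\b");
            if (pattern.matcher(text).find()) {
                hits.add(weakWord);
            }
        }
        return hits;
    }
}
```

Returning the matched words, instead of a plain yes/no flag, is what allows a later output field to show why a sentence was flagged.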

I would not describe this case as vague, the precise exceptions are given in reference: https://gist.github.com/morido/fe9e08252256e7137382#file-metadatadeterminertest-java-L342

Ok, so you are saying the requirement basically states "Appendix A3.4 shall be implemented"? Too bad the requirement is a lot more wordy and those words look so terribly vague... Seriously: This should be resolved manually. I believe it is perfectly fine for an algorithm to flag this.

MERCEmentre commented 9 years ago

Hello Moritz,

On 14/02/2015 22:36, Moritz Dorka wrote:

It should not be too difficult to implement that. For what do you need it and when should it trigger?

The main idea is to provide a way to collect all external requirements, i.e. requirements that cannot be fulfilled by the considered system. It would be very useful to define requirements on the environment of the system, to be checked later. But I'm probably a bit too optimistic. ;-)

But be aware: This would not affect the legal obligation in the above case, which would still remain "UNKNOWN" simply because the sentence neither contains "shall" nor "may" ("must" is not a legal keyword in our specs).

OK, I understand.

Why do you consider this case vague?
https://gist.github.com/morido/fe9e08252256e7137382#file-metadatadeterminertest-java-L341

Because of "defined time" which is never actually defined...

Right, well spotted!

There will be a field in the ReqIF-output which highlights those keywords in the respective sentences. So it should become more obvious.

OK, cool!

I would not describe this case as vague, the precise exceptions are given in reference: https://gist.github.com/morido/fe9e08252256e7137382#file-metadatadeterminertest-java-L342

Ok, so you are saying the requirement basically states "Appendix A3.4 shall be implemented"? Too bad the requirement is a lot more wordy and those words look so terribly vague... Seriously: This should be resolved manually. I believe it is perfectly fine for an algorithm to flag this.

I agree.

Thanks! david

morido commented 9 years ago

So what we know by now: Legal Obligation and External Reference are two entirely unrelated properties. I.e. a requirement may reference an external entity irrespective of whether it is mandatory/optional/whatever.

The main idea is to provide a way to collect all external requirements, i.e. requirements that cannot be fulfilled by the considered system. It would be very useful to define requirements on the environment of the system, to be checked later.

Can you name any good examples of such requirements? On that basis I might be able to craft some detection algorithm. And could you elaborate a little more on the phrase "to be checked later"? What do you actually want to know? If a given chapter / set of requirements contains external references? A list of those referenced external entities? A connection between the presence of an external reference and the legal obligation of the respective requirement? ...?

MERCEmentre commented 9 years ago

Hello Moritz,

Sorry for the late reply.

I don't know if it is worth your time to develop your algorithm in more detail, even though I do see value in classifying external requirements.

On 20/02/2015 13:37, Moritz Dorka wrote:

So what we know by now: /Legal Obligation/ and /External Reference/ are two entirely unrelated properties. I.e. a requirement may reference an external entity irrespectively of whether it is mandatory/optional/whatever.

Yes.

The main idea is to provide a way to collect all external requirements, i.e. requirements that cannot be fulfilled by the considered system. It would be very useful to define requirements on the environment of the system, to be checked later.

Can you name any good examples for such requirements? On that basis I might be able to craft some detection algorithm.

§3.5.4.2: "When EURORADIO indicates the loss of the safe radio connection, the ERTMS/ETCS on-board equipment shall immediately try to set-up a new safe radio connection."

=> EURORADIO is not part of on-board EVC, this requirement is related to an external part.

§3.6.4.3.1 "Justification: it is always the trackside responsibility to provide linking in due course, knowing this rule; if the location related information is to be used in situations where linking is not provided (e.g. TSR transmitted by balise group marked as unlinked), the trackside can include provisions, if deemed necessary, when engineering the distance information."

=> Requirement induced on trackside.

§3.6.6.1 "The ERTMS/ETCS on-board equipment shall display, only on driver request, the geographical position of the estimated front end of the train in relation to the track kilometre. The display of the geographical position shall also be stopped on driver request."

=> Requirement on DMI.

§3.7.1.1 "To control the train movement in an ERTMS/ETCS based system the ERTMS/ETCS on-board equipment shall be given information from the trackside system both concerning the route set for the train and the track description for that route. The following information shall be given from the trackside [...]"

=> requirement on trackside.

§3.7.2.2 "The trackside shall be responsible for that the on-board equipment has received the information valid for the distance covered by the Movement Authority."

=> requirement on trackside.

And could you elaborate a little more on the phrase "to be checked later"? What do you actually want to know? If a given chapter / set of requirements contains external references?

More importantly, a list of external requirements per chapter / set of requirements.

A list of those referenced external entities?

Yes, a list of those referenced external entities and requirements naming them.

A connection between the presence of an external reference and the legal obligation of the respective requirement? ...?

That would also be useful.

In my view, SUBSET-026 contains requirements for the on-board EVC but also for entities external to the EVC. The correct working of the system is ensured when all those requirements are fulfilled. So if I develop an ERTMS system, I need to keep track of those requirements and assign them to the corresponding subsystem (DMI, EVC, trackside, ...).

Best regards, david

morido commented 9 years ago

Thanks for the examples. Sounds like a nice challenge... Generally: Yes, I do have the intention to implement such detection algorithms. So we should at least try :-)

What I could certainly do is extract all [compound-] nouns from all requirements of a certain chapter, filter them (i.e. some sort of blacklisting for common English words) and then see what remains. Does that sound promising to you? Admittedly, this will not directly target external entities. But it should return a superset of those, and maybe it is feasible to do the remaining filtering by hand?
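As a rough illustration of the "extract, blacklist, see what remains" idea: the sketch below does not detect nouns at all, which would need a part-of-speech tagger. It simply counts every token that is not on a small blacklist, so the result is an even larger superset than the one proposed. The blacklist contents, the class name and the method name are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class CandidateTermSketch {

    // Hypothetical blacklist of common English words; a real one would be far longer.
    private static final Set<String> BLACKLIST = new HashSet<>(Arrays.asList(
            "the", "a", "an", "of", "to", "in", "on", "for", "and", "or", "by",
            "be", "is", "shall", "may", "when", "that", "this", "it", "from"));

    /** Counts the non-blacklisted terms over all requirement texts of one chapter. */
    static Map<String, Integer> candidateTerms(List<String> requirementTexts) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String text : requirementTexts) {
            // Keep '/' and '-' inside tokens so "ERTMS/ETCS" and "on-board" stay whole.
            for (String token : text.split("[^A-Za-z/-]+")) {
                String term = token.toLowerCase(Locale.ROOT);
                if (term.length() < 2 || !Character.isLetter(term.charAt(0))) {
                    continue;
                }
                if (!BLACKLIST.contains(term)) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

The remaining terms, sorted alphabetically with their counts, would then be the list to filter by hand for external entities such as EURORADIO or trackside.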

MERCEmentre commented 9 years ago

Filtering: honestly, I don't know. :-)

Filtering by hand: your algorithms are not perfect anyway, so manual handling is needed. The only question is then what to do for a new version of the SUBSET? Process all new/changed paragraphs automatically (i.e. with your algorithm) and then manually? Only manually?

morido commented 9 years ago

Filtering by hand: your algorithms are not perfect anyway,

How would you know? You haven't seen them, yet... :-)

The only question is then what to do for a new version of the SUBSET?

I am using heuristic approaches with a touch of NLP. So new/altered texts are not a problem as long as they still match my heuristic expectations.

MERCEmentre commented 9 years ago

On 04/03/2015 12:23, Moritz Dorka wrote:

How would you know? You haven't seen them, yet... :-)

:-)

The only question is then what to do for a new version of the SUBSET?

I am using heuristic approaches with a touch of NLP. So new/altered texts are not a problem as long as they still match my heuristic expectations.

OK, good to know!

Best regards, david

morido commented 9 years ago

Here is an example of how external entities are currently displayed in ProR (after they have been detected by my tool). Each of those "boxes" can be queried for (although a nice GUI for such queries is currently missing from ProR). @MERCEmentre Does that help you in any way?

[Image: implementerenhanced]

jastram commented 9 years ago

Considering that @morido concluded his work, I will close this issue.

MERCEmentre commented 9 years ago

Just for the record, I found @morido's latest work quite interesting.

RobertoKretschmer commented 9 years ago

I will be out of the office starting 05.06.2015 and will not return until 20.06.2015.

I will get back to you as soon as possible. In case of urgent matters please contact: Carsten.Kuebler@twt-gmbh.de

Note: This is an automatic reply to your message "Re: [toolchain] Feedback on requirement metadata attribution (#479)" sent on 09.06.2015 13:58:58.

This is the only notification you will receive while this person is away.