ankraft / ACME-oneM2M-CSE

An open source CSE Middleware for Education.
https://acmecse.net/
BSD 3-Clause "New" or "Revised" License
23 stars 16 forks source link

Fixed error when issuing semantic queries #149

Closed Luke1734 closed 5 months ago

Luke1734 commented 5 months ago

The issue I have is around issuing semantic queries. Following the instructions in the SAREF OneM2M implementation example document provided by ETSI at:

https://www.etsi.org/deliver/etsi_ts/103700_103799/103780/01.01.01_60/ts_103780v010101p.pdf

the example in section 8.3.3 suggests the format of the semantic query http request to be:

'http://localhost:50000/oneM2M-semantics?fu=1&sqi=1&smf='+ encSMQ

Modifying this slightly gives the URL which I used of:

http://localhost:8080/cse-in?fu=1&sqi=1&smf=’ + encSMQ

(encSMQ being the base64 encoded semantic query as per spec)

When using the ACME CSE an error is given:

{'m2m:dbg': 'invalid arguments/attributes: sqi request attribute is only allowed for RETRIEVE/DISCOVERY operations'}

This is because of the use of fu=1 in the URL indicating that this is a discovery operation – which I believe to be correct to the spec – however in the file acme/services/RequestManager.py there is a check made for this fu=1 parameter and if so, the cseRequest.op is updated to equal Operation.DISCOVERY instead of Operation.RETRIEVE.

This check is done on the lines (1320-1321):

if cseRequest.fc.fu == FilterUsage.discoveryCriteria and cseRequest.op == Operation.RETRIEVE:   # correct operation if necessary
    cseRequest.op = Operation.DISCOVERY

This then means when a check is made further down in the code for the semantic query indicator, there is an if statement that throws an error should the semantic query indicator be set but the operation not be Operation.RETRIEVE (which it now isn’t due to the above update)

However, simply excluding this fu=1 parameter in the request does not fix the issue as now incorrect default rcn attributes are assigned for a retrieve operation, which this is not, and hence the code fails with a different error:

{'m2m:dbg': 'invalid arguments/attributes: rcn: attributes not allowed in DISCOVERY operation'}

One fix which I have been using is to change the code in the RequestManager.py file from:

if cseRequest.op != Operation.RETRIEVE:
      raise BAD_REQUEST(L.logDebug('sqi request attribute is only allowed for RETRIEVE/DISCOVERY operations'), data = cseRequest)
 else:
      cseRequest.sqi = v
      cseRequest.op = Operation.DISCOVERY

To:

if cseRequest.op not in [Operation.RETRIEVE, Operation.DISCOVERY]:
      raise BAD_REQUEST(L.logDebug('sqi request attribute is only allowed for RETRIEVE/DISCOVERY operations'), data = cseRequest)

else:
      cseRequest.sqi = v
      cseRequest.op = Operation.DISCOVERY
      rcn = ResultContentType.semanticContent

This solution allows for the request to be made with or without the fu=1 parameter since in the case it is supplied, then cseRequest.op is updated to Operation.DISCOVERY as described above, and if it is not, then it will still be Operation.RETRIEVE at this point and the code here will update it in the else clause.

There is also the addition of the final line rcn = ResultContentType.semanticContent which updates the rcn to the correct value for a semantic query such that the second error of the rcn attributes not being allowed in discovery is fixed.

ankraft commented 5 months ago

Thank you for the PR and the very extensive explanation.

Reading the example in the ETSI document: I think that the use of fu=1 in the request is not correct since the request is, when sent by the originator, a RETRIEVE request (see the definition for Semantic Query Indicator in TS-0001, clause 8.1.2).

However, the first change makes total sense.

I am not so happy with the second change, though. Yes, rcn = semantic-content is the only rcn that makes sense here, and it should be the default when this is a semantic query, but unfortunately, TS-0001, table Table 8.1.2-1 doesn't say so (not even in a note :( ). This means that the CSE should not "repair" a missing or wrong rcn here.
Also, semanticContent is an allowed rcn for DISCOVERY (see Types.py, line 1102).
Or am I missing something here?

If you agree, could you change the PR and remove the second change? Thanks a lot!

Luke1734 commented 5 months ago

I agree with your stance that using fu=1 in the request as being unnecessary based upon the definition of the semantic query indicator you referenced. I was using this as it is a part of the example usage given in ETSI TS 103 780 section 8.3.3 but perhaps that document is outdated or there is something we are missing.

I have looked in detail at the documentation and specifically at table 8.1.2-1. Within this it is specified that the semantic-content rcn is only valid for Retrieve and not Retrieve (filterUsage='discovery'). From this I have added a change to the Types.py file to reflect this and have reverted my change for the semantic query indicator check to allow both Retrieve and Discovery operations. This is further supported by table 8.1.2-2, the condition tag, semanticsFilter, which has the following description:

Both semantic resource discovery and semantic query use semanticsFilter to specify a query statement that shall be specified in the SPARQL query language [5]. When a CSE receives a RETRIEVE request including a semanticsFilter, and the Semantic Query Indicator parameter is also present in the request, the request shall be processed as a semantic query; otherwise, the request shall be processed as a semantic resource discovery. In the case of semantic resource discovery targeting a specific resource, if the semantic description contained in the of a child resource matches the semanticFilter, the URI of this child resource will be included in the semantic resource discovery result. In the case of semantic query, given a received semantic query request and its query scope, the SPARQL query statement shall be executed over aggregated semantic information collected from the semantic resource(s) in the query scope and the produced output will be the result of this semantic query.

The key takeaway being that the semantic query indicator being present indicates a semantic query, and thus a RETRIEVE operation, otherwise if the smf parameter is present without the sqi being set to True, it must be a DISCOVERY operation, hence the sqi check for only RETRIEVE operations was correct, but it was the incorrect assignment of result content types for operations in the Types.py file partially causing the issue.

As for your point around the CSE updating the rcn type itself - I would argue that this is the intended behaviour from the following description of the semantic-content rcn type in section 8.1.2:

semantic-content: Representation of semantic information that is the result of a semantic query as indicated by the setting of the Semantic Query Indicator parameter. Note that for any of the above options, Discovery access control is applied against discovery related procedures, while Retrieve access control procedures is applied against non-discovery related Retrieve operations. Note that the fitter criteria usage governs the purpose of a Retrieve operation.

a mention is made that the use of the semanticContent result content type is

indicated by the setting of the Semantic Query Indicator parameter.

My interpretation of this is that it is down to the CSE to set the rcn based off the use of the sqi parameter being set to True. Though perhaps I have misunderstood. Notably all works well if rcn=10 is provided in the request (rcn=10 being for ResultContentType.semanticContent) - so it is whether it is down to the user to provide that parameter in their request - which again the example usage from ETSI TS 103 780 did not require, however perhaps this is another mistake in that document, or a misinterpretation on my end.

One final change I have made is within the RequestManager.py file (code snippet below). Currently, a check is made that enforces sqi and smf being specified together, but the above quotes from the documentation show that there can be a scenario where smf is provided without sqi. In this case, the semantic function is used to match against the semantic content contained within semantic descriptor resources, returning a list of URI's for the semantic descriptors that match the smf provided.

if [ cseRequest.fc.smf is not None, cseRequest.sqi is not None ].count(True) not in [ 0, 2 ]:
    raise BAD_REQUEST(L.logDebug('sqi and smf must be specified together'), data = cseRequest)

Thus I believe this to be an invalid check and have thus removed it. The other checks below it (see below) also suggest a scenario where cseRquest.sqi can be None and be valid for the semantic resource discovery case further supporting the rationale behind this change:

if cseRequest.sqi is not None and not cseRequest.sqi and cseRequest.rcn != ResultContentType.discoveryResultReferences:
    raise BAD_REQUEST(L.logDebug('Wrong ResultContentType for sqi == False (must be discoveryResultReferences)'), data = cseRequest)