Closed by landauermax 3 years ago
I have reproduced the results and concluded that this is a configuration error, not a bug. ALLOW_ALL is not intended to be used for simple strings; only full objects and lists can be parsed with ALLOW_ALL.
I have changed the example config as follows:
- id: json
  start: True
  type: JsonModelElement
  name: 'model'
  key_parser_dict:
    "aa": a
    fields:
      "bb": b
      "cc":
        - ALLOW_ALL
      "dd": d
Using ALLOW_ALL in the cc list works as intended, with the following results (note: I have already fixed the issue with the paths):
2021-07-12 12:18:38 New path(es) detected
NewMatchPathDetector: "DefaultNewMatchPathDetector" (1 lines)
{
  "AnalysisComponent": {
    "AnalysisComponentIdentifier": 1,
    "AnalysisComponentType": "NewMatchPathDetector",
    "AnalysisComponentName": "DefaultNewMatchPathDetector",
    "Message": "New path(es) detected",
    "PersistenceFileName": "Default",
    "AffectedLogAtomPaths": [
      "/model",
      "/model/a",
      "/model/fields/b",
      "/model/fields/cc",
      "/model/fields/d"
    ],
    "ParsedLogAtom": {
      "/model": {
        "aa": "a1",
        "fields": {
          "bb": "b1",
          "cc": [
            "c1"
          ],
          "dd": "d1"
        }
      },
      "/model/a": "a1",
      "/model/fields/b": "b1",
      "/model/fields/cc": "c1",
      "/model/fields/d": "d1"
    }
  },
  "LogData": {
    "RawLogData": [
      "{\n \"aa\": \"a1\",\n \"fields\": {\n \"bb\": \"b1\",\n \"cc\": [\n \"c1\"\n ],\n \"dd\": \"d1\"\n }\n}"
    ],
    "Timestamps": [
      1626085118.79
    ],
    "DetectionTimestamp": 1626085118.79,
    "LogLinesCount": 1,
    "AnnotatedMatchElement": "/model: {'aa': 'a1', 'fields': {'bb': 'b1', 'cc': ['c1'], 'dd': 'd1'}}\n /model/a: a1\n /model/fields/b: b1\n /model/fields/cc: c1\n /model/fields/d: d1"
  }
}
Okay, I was not aware of that, and I do not think it is obvious that ALLOW_ALL can only be used in that way. So either we extend ALLOW_ALL to also work for strings (and any other data, i.e., give it the same functionality as an AnyByteDataModelElement), or we make sure that unparsed atoms are generated when something that is not a list or object occurs on an ALLOW_ALL field. Which do you prefer?
I have implemented the second option: only lists and objects are allowed. There is no reason to use ALLOW_ALL on simple strings, and we should let the user know. There is always a possibility to parse a string with a regular element.
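To illustrate the rule described above, here is a minimal Python sketch. It is not the actual parser code; `allow_all_accepts` and `check_field` are hypothetical helpers, and the only assumption is that ALLOW_ALL matches JSON objects and lists while anything else ends up as unparsed.

```python
import json


def allow_all_accepts(value):
    """Return True if a decoded JSON value may be matched by ALLOW_ALL.

    Assumption: only containers (JSON objects and lists) qualify;
    strings, numbers, booleans and null do not.
    """
    return isinstance(value, (dict, list))


def check_field(raw_json, field):
    """Hypothetical helper: classify one ALLOW_ALL field as parsed or unparsed."""
    value = json.loads(raw_json)["fields"][field]
    return "parsed" if allow_all_accepts(value) else "unparsed"


# A list value is accepted, a plain string is not.
ok = check_field('{"fields": {"cc": ["c1"]}}', "cc")
bad = check_field('{"fields": {"cc": "c1"}}', "cc")
print(ok, bad)
```

With this rule, a config that puts ALLOW_ALL on a plain string field would surface as unparsed data rather than being silently accepted.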
I have the following sample json:
And I use the following config:
Note that each field is mapped to an element. As expected, each field is correctly represented in the output:
However, if I change the element of field "bb" to ALLOW_ALL, i.e.,
then the output changes as follows:
Now the value "b1" is stored in "/model/fields", which is not intuitive. Since there is no user-defined name for an element when using ALLOW_ALL, we could store these values in "/model/fields/allow_all" followed by a counter, or maybe under the key of the field itself, e.g., "/model/fields/bb" in this case.
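The two naming options could look roughly like the following Python sketch. This is only an illustration of the proposal; `allow_all_paths` is a hypothetical function, and both the exact "allow_all" plus counter format and the counter starting at 0 are assumptions.

```python
def allow_all_paths(parent_path, keys, scheme="key"):
    """Hypothetical: build match paths for fields matched by ALLOW_ALL.

    scheme="counter" appends "allow_all" plus a running counter (assumed
    to start at 0); scheme="key" reuses the JSON key as the path segment.
    """
    paths = {}
    for counter, key in enumerate(keys):
        if scheme == "counter":
            suffix = f"allow_all{counter}"
        else:
            suffix = key
        paths[key] = f"{parent_path}/{suffix}"
    return paths


by_counter = allow_all_paths("/model/fields", ["bb"], scheme="counter")
by_key = allow_all_paths("/model/fields", ["bb"])
print(by_counter)
print(by_key)
```

The key-based variant keeps paths stable when fields are reordered, whereas counter-based names depend on the order in which ALLOW_ALL fields are encountered.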
Even stranger, although "c1" is still correctly stored in "/model/fields/c", the value "d1" and the path "/model/fields/d" have disappeared from the parsed element (see AffectedLogAtomPaths). I assume this is a bug that needs to be fixed.