Closed kam193 closed 5 months ago
Hmm... I wonder if it has anything to do with: https://github.com/CybercentreCanada/assemblyline-base/blob/90ba51e38c3e0e3bf01c8b1f4541902a89208ed0/assemblyline/odm/models/config.py#L1903
At least based on the last line of the error, and that was the only new thing added to the Config ODM in the new release. Surprised I haven't seen this before...
Could it be perhaps something related to your configuration? I just checked our logging stack and found no mention of the error, so I suspect it could be a parsing error for the metadata enforcement configuration that's causing the error to be raised.
I would suspect to see this error in the other core containers as well since they should all call forge.get_config()
It may be; I have indeed tried to configure it, and for some reason I didn't try the upgrade without configuring metadata. But the core containers didn't have any such problem, just the privileged services and update containers. Anyway, I'll check it and come back, thanks for the tip.
The current state of debugging: I removed the `submission` key from the config entirely & re-updated AL, then tried to update a service - no success. Now I'm looking into what exactly causes the failure.
And we have the winner - it is, indeed, the config for metadata. However, this is not what core services see. They see the original configuration I wrote, but all privileged services, including the container for updating, get the configuration prepared by the scaler. It looks like the scaler doesn't dump it correctly - here is what it produces:
max_temp_data_length: 4096
metadata:
  archive: {}
  ingest:
    INGEST: !!python/object:assemblyline.odm.base.TypedMapping <----- Line 451, as in the traceback
      index: false
      sanitizer: !!python/object/apply:re._compile
      - ^[A-Za-z0-9_ -.]*$
      - 32
      store: false
      type: !!python/object:assemblyline.odm.base.Compound
        ai: true
        child_type: !!python/name:assemblyline.odm.models.config.Metadata ''
        copyto: []
        default: null
        default_set: false
        deprecation: null
        description: null
        getter_function: null
        index: false
        multivalued: false
        name: null
        optional: false
        parent_name: null
        setter_function: null
        store: false
  submit: {}
I'm not entirely sure, but it looks like a dump of the `TypedMapping` object itself, not the data. Setting any configuration or removing the `INGEST` key doesn't help either. It looks like one of the differences between the docker compose and kubernetes deployments, so you might not see it.
It also explains why my manual update worked - I didn't mount the prepared configuration, and the things necessary for service registration are still at their defaults (yeah, I know they shouldn't be, but it's still a hobby setup ;)), so it worked without the real config file.
Hmm... so it would stand to reason that if you were to shell into the Scaler container and run:
from assemblyline.common import forge
forge.get_config().as_primitives()
it should yield the same error/garbage output?
Not exactly, but it does not really return primitives. Let's have a look - all in the scaler container of AL v27:
>>> c = forge.get_config().as_primitives()
>>> c["submission"]
{'default_max_extracted': 500, 'default_max_supplementary': 500, 'dtl': 30, 'emptyresult_dtl': 5, 'max_dtl': 0, 'max_extraction_depth': 6, 'max_file_size': 524288000, 'max_metadata_length': 4096, 'max_temp_data_length': 4096, 'metadata': {'archive': {}, 'submit': {}, 'ingest': {'INGEST': {}}}, 'sha256_sources': [], 'file_sources': [], 'tag_types': {'attribution': ['attribution.actor', 'attribution.campaign', 'attribution.exploit', 'attribution.implant', 'attribution.family', 'attribution.network', 'av.virus_name', 'file.config', 'technique.obfuscation'], 'behavior': ['file.behavior'], 'ioc': ['network.email.address', 'network.static.ip', 'network.static.domain', 'network.static.uri', 'network.dynamic.ip', 'network.dynamic.domain', 'network.dynamic.uri']}, 'verdicts': {'info': 0, 'suspicious': 300, 'highly_suspicious': 700, 'malicious': 1000}}
>>> # ^ Looks okay
>>> import yaml
>>> d = yaml.dump(c)
>>> "!!python/object:assemblyline.odm.base.TypedMapping" in d
True
>>> d
' ... \n metadata:\n archive: {}\n ingest:\n INGEST: !!python/object:assemblyline.odm.base.TypedMapping\n index: false\n sanitizer: !!python/object/apply:re._compile...'
>>> # But dumped wrongly
>>> type(c)
<class 'dict'>
>>> type(c["submission"])
<class 'dict'>
>>> type(c["submission"]["metadata"])
<class 'dict'>
>>> type(c["submission"]["metadata"]["ingest"])
<class 'dict'>
>>> type(c["submission"]["metadata"]["ingest"]["INGEST"])
<class 'assemblyline.odm.base.TypedMapping'>
>>> # ^ Because it's not a dict
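The REPL session above can be reproduced without Assemblyline: PyYAML's unsafe `yaml.dump` serializes any object it doesn't recognize as a `!!python/object:` tag with the object's internal state, rather than its data. A minimal sketch of both the symptom and one possible fix - note that `CustomMapping` and `to_primitives` here are hypothetical stand-ins for illustration, not Assemblyline code:

```python
import yaml

# Hypothetical stand-in for a custom mapping class (like assemblyline's
# TypedMapping) that yaml.dump cannot represent as a plain mapping.
class CustomMapping:
    def __init__(self, data):
        self.data = dict(data)

c = {"metadata": {"ingest": {"INGEST": CustomMapping({})}}}

# The unsafe dumper falls back to a python/object tag for unknown classes,
# dumping the object's attributes instead of its data.
dumped = yaml.dump(c)
assert "!!python/object:" in dumped

# One possible workaround: recursively unwrap custom mappings into plain
# dicts before dumping.
def to_primitives(value):
    if isinstance(value, CustomMapping):
        value = value.data
    if isinstance(value, dict):
        return {k: to_primitives(v) for k, v in value.items()}
    return value

clean = yaml.dump(to_primitives(c))
assert "!!python/object:" not in clean
```

This is also why `yaml.safe_dump` would have failed loudly here instead of silently writing a broken config: the safe dumper raises a `RepresenterError` for unknown object types.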
I think I know a solution to fix that 😁
You can try testing with the 4.5.1.dev166 release to make sure there aren't any other issues that 4.5.0.27 introduced for the Docker deployments.
I confirm that with this release updating works, and the services look like they're functioning as well, thanks!
BTW I love the re-design, when does it come to the stable branch?
I can include the patch in the next stable, once the PR is approved!
The update to the iconography was included in the 4.5.0.28 release we pushed yesterday 😁
Thanks, I've just installed it and can confirm it works :)
Hey, I think we both didn't test the service properly... I ran it with the default configuration, but not with really configured metadata. When I played with metadata, the scaler service wasn't happy again:
{"@timestamp": "2024-06-04 16:53:41,417", "event": { "module": "assemblyline", "dataset": "assemblyline.scaler" }, "host": { "ip": "x.x.x.x", "hostname": "2a55612146dd" }, "log": { "level": "INFO", "logger": "assemblyline.scaler" }, "process": { "pid": "1" }, "message": "Found the service server at: b1d0d55f6dc96d9942b27fc5dfd5060e5e5cfaca41e17b24a58faf5c425c7e84 [assemblyline-service_server-1]"}
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_core/scaler/run_scaler.py", line 6, in <module>
with ScalerServer() as scaler:
^^^^^^^^^^^^^^
File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_core/scaler/scaler_server.py", line 365, in __init__
yaml.dump(json.loads(self.config.json()), handle)
^^^^^^^^^^^^^^^^^^
File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/odm/base.py", line 1403, in json
return json.dumps(self.as_primitives())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 258, in iterencode
return _iterencode(o, 0)
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 180, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Metadata is not JSON serializable
However, it looks to be a problem with the `ingest` metadata declaration (which I don't need anyway, so it's not a big issue for me), because the following configuration fails:
...
submission:
  metadata:
    submit:
      python.package_name:
        validator_type: text
        required: false
      python.uploader:
        validator_type: text
        required: false
    ingest:
      INGEST:
        python.package_name:
          validator_type: text
          required: false
but when I remove the `ingest` key entirely, the scaler service works again. It looks like either I misunderstood the expected config syntax, or `as_primitives` does not handle the `ingest` definition properly.
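The `TypeError` in the traceback above is generic `json` behavior: `json.dumps` raises it for any value that isn't a plain primitive, which is why a model object surviving inside the `as_primitives()` output breaks the scaler's `self.config.json()` call. A minimal sketch - the `Metadata` class here is a hypothetical stand-in, and the `default=` hook is only a workaround idea, not the actual Assemblyline fix:

```python
import json

# Hypothetical stand-in for an ODM model object (like assemblyline's
# Metadata) left un-converted inside the as_primitives() output.
class Metadata:
    def __init__(self, fields):
        self.fields = fields

primitives = {"submission": {"metadata": {"ingest": {"INGEST": Metadata({})}}}}

# json.dumps has no idea how to encode the leftover model object.
try:
    json.dumps(primitives)
except TypeError as e:
    print(e)  # Object of type Metadata is not JSON serializable

# Defensive workaround: a default= hook that unwraps known model objects
# into plain dicts before encoding.
dumped = json.dumps(primitives, default=lambda o: o.fields)
assert dumped == '{"submission": {"metadata": {"ingest": {"INGEST": {}}}}}'
```

The real fix, of course, is for `as_primitives()` to convert nested `ingest` definitions before they ever reach the encoder; the hook only papers over the symptom.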
Describe the bug After upgrading to 4.5.0.27, AssemblyLine isn't able to upgrade any service. However, the upgrade looks successful if I start a container with the new service image manually. I see a couple of errors from starting containers (about the ODM schema), but I'm unable to say if they come from the update container (I guess they do not - I see errors from the updater and privileged services even if I try to upgrade a regular service). My suspicion is that for some reason the updating thread is silently failing before even starting the container.
Have you seen anything like this, or should I try to add some additional logs/try-except in the updater container?
To Reproduce Steps to reproduce the behavior:
This is an example log from the updater, here for Swiffer (I tried with a few different services):
If I try the following command (for Sigma, which also couldn't be updated by AL; I reused the network created by AL for the Safelist):
It's successful and AL recognizes the new version.
Expected behavior Services are upgraded
Screenshots
Environment (please complete the following information if pertinent):
Additional context Here is an example of the errors I see in logs, but I think they should be gone after the service upgrade and are unrelated: