Closed: HUSMUS9999 closed this issue 5 months ago.
I don't know what happens here, but I would suggest checking the machinery tags for a bad character or something similar.
I have the same problem (using Azure with scale set pool tags). @HUSMUS9999, could you please print the contents of the following tables:

- tasks_tags (should be empty, but who knows)
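If it helps, here is one way to dump those tables so they can be pasted into the thread (a sketch only: the connection string is an assumption, adjust it to the database settings in your cuckoo.conf):

```python
# Sketch: dump the CAPE tables discussed in this thread.
# Assumption: a local PostgreSQL database "cape" reachable as user "cape";
# adjust the DSN to whatever conf/cuckoo.conf points at.
import psycopg2

TABLES = ["machines", "tags", "machines_tags", "tasks", "tasks_tags"]

conn = psycopg2.connect("dbname=cape user=cape")
with conn, conn.cursor() as cur:
    for table in TABLES:
        cur.execute(f"SELECT * FROM {table}")  # table names come from our own list above
        print(f"--- {table} ---")
        for row in cur.fetchall():
            print(row)
conn.close()
```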
From what I understand, there is a problem in the submission-related code where the tag ID gets a +1; this causes the tag IDs to rise (as you can see in your prints) with the same values as the tasks, causing the obvious relation problem.
Following up on the previous comment, these are my DB tables related to this issue. I'll give some context here:
My `az.conf`:

```ini
[Sandbox-Cape-VMSS-1]
gallery_image_name = Sandbox-Cape-Image-Definition-v8
platform = windows
arch = x64
#tags = x64
pool_tag = x64
initial_pool_size = 1
```
```
2024-06-21 09:34:32,203 [modules.machinery.az] DEBUG: Sandbox-Cape-VMSS-1_0: Initializing...
2024-06-21 09:34:42,245 [modules.machinery.az] DEBUG: Machine Sandbox-Cape-VMSS-1_0 was created and available in 104s
2024-06-21 09:34:42,475 [lib.cuckoo.core.machinery_manager] DEBUG: SFSG : available machines : [<Machine(1,'CSS-Sandbox-Cape-VMSS-1_0')>]
2024-06-21 09:34:42,476 [lib.cuckoo.core.machinery_manager] INFO: Loaded 1 machine
2024-06-21 09:34:42,496 [lib.cuckoo.core.machinery_manager] INFO: max_vmstartup_count for BoundedSemaphore = 5
```
machines:

| id | name | label | arch | ip | platform | interface | snapshot | locked | locked_changed_on | status | status_changed_on | resultserver_ip | resultserver_port | reserved |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Sandbox-Cape-VMSS-1_0 | Sandbox-Cape-VMSS-1_0 | x64 | 10.3.0.4 | windows | Sandbox-Cape-VMSS-Subnet-nic01 | /subscriptions/… | f | | | | 10.0.6.7 | 2042 | f |

tags:

| id | name |
|---|---|
| 1 | x64 |

machines_tags:

| machine_id | tag_id |
|---|---|
| 1 | 1 |

tasks_tags (empty):

| task_id | tag_id |
|---|---|

tasks (empty):
id | target | category | cape | timeout | priority | custom | machine | package | route | tags_tasks | options | platform | memory | enforce_timeout | clock | added_on | started_on | completed_on | status | dropped_files | running_processes | api_calls | domains | signatures_total | signatures_alert | files_written | registry_keys_modified | crash_issues | anti_issues | analysis_started_on | analysis_finished_on | processing_started_on | processing_finished_on | signatures_started_on | signatures_finished_on | reporting_started_on | reporting_finished_on | timedout | sample_id | machine_id | shrike_url | shrike_refer | shrike_msg | shrike_sid | parent_id | tlp | user_id | username |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
So tasks_tags and tasks are obviously empty, due to the error reported by OP.
At this point, when submitting a new task, CAPE executes this insert:
```
[SQL: INSERT INTO tasks_tags (task_id, tag_id) VALUES (%(task_id__0)s, %(tag_id__0)s), (%(task_id__1)s, %(tag_id__1)s)]
```

where in my test the parameters were set to:

```
[parameters: {'task_id__0': 1, 'tag_id__0': 1, 'task_id__1': 1, 'tag_id__1': 2}]
```
As you can see, even with a clean DB the insert uses two sets of task_id and tag_id parameters: `task_id__0`/`tag_id__0` and `task_id__1`/`tag_id__1`, with `tag_id__1` incremented from 1 to 2. Even with two tasks submitted, this behaviour would likely result in a DB error, because the value being incremented is the tag_id and not the task_id.
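For what it's worth, a single INSERT carrying two (task_id, tag_id) pairs is what SQLAlchemy emits when one task has two tags attached, one row per pair, so the second pair is not necessarily an off-by-one. A minimal, self-contained sketch (generic models, not CAPE's actual ones) that produces the same kind of statement:

```python
# Minimal sketch: a task/tag many-to-many like the one in CAPE's schema.
# With two tags on one task, the flush writes two (task_id, tag_id) rows
# into the association table in a single statement.
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

tasks_tags = Table(
    "tasks_tags",
    Base.metadata,
    Column("task_id", Integer, ForeignKey("tasks.id")),
    Column("tag_id", Integer, ForeignKey("tags.id")),
)


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)


class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    target = Column(String)
    tags = relationship("Tag", secondary=tasks_tags, backref="tasks")


engine = create_engine("sqlite://", echo=True)  # echo shows the generated SQL
Base.metadata.create_all(engine)

with Session(engine) as session:
    task = Task(target="sample.msi", tags=[Tag(name="win10x64"), Tag(name="x86")])
    session.add(task)
    session.commit()  # one INSERT into tasks_tags carrying both (task_id, tag_id) pairs
```

The error only appears when the second tag's row is missing from tags at flush time, which is what the foreign key violation on tag_id=2 suggests.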
At this point I don't actually know where this action is performed inside the code, nor why, so I'll have to keep digging until it makes sense.
@doomedraven @cccs-mog @tbeadle @cccs-kevin Since you clearly have more experience with the new logic of the machinery module, could you please correct my hypothesis in case I got something wrong and follow up with the right logic for adding a task to the relation table tasks_tags?
Why do you comment out `#tags = x64`? That might be the problem. I don't know, I don't have Azure.
I tried to use `tags` instead of `pool_tag` in a previous test; it wasn't successful though (my bad, I should have read the code). Anyway, its presence doesn't change anything: I made some previous tests without it and I still get this issue.
The configurations I tried are the following:

```ini
arch = x64
tags = x64
pool_tag = x64
```

```ini
arch = x64
tags = x64,x86
pool_tag = x64
```

```ini
arch = x64
#tags = x64
pool_tag = x64,x86
```

plus the final one shown in the previous post. None of these configurations prevented the tag_id from being incremented as I pointed out before.
So your problem is with Azure, right?
No, the problem is not directly related to Azure.
I found a way to bypass the task/tag relation in `database.py` by commenting out the section of code in the `add()` function that deals with the tags format:
```python
task.cape = cape
task.tags_tasks = tags_tasks
# Deal with tags format (i.e., foo,bar,baz)
if tags:
    for tag in tags.split(","):
        tag_name = tag.strip()
        if tag_name and tag_name not in [tag.name for tag in task.tags]:
# if tags:
#     for tag in tags.split(","):
#         tag_name = tag.strip()
#         if tag_name and tag_name not in [tag.name for tag in task.tags]:
            # "Task" object is being merged into a Session along the backref cascade path for relationship "Tag.tasks"; in SQLAlchemy 2.0, this reverse cascade will not take place.
            # Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag
            # (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
            task.tags.append(self._get_or_create(Tag, name=tag_name))
#             task.tags.append(self._get_or_create(Tag, name=tag_name))

if clock:
    if isinstance(clock, str):
```
But of course this is not a solution to the problem; I'm trying to understand whether the tag ID gets wrongly incremented during the for loop.
Could someone follow up with the actual expected output of this section?
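For reference, my reading of what that section is expected to produce, reduced to plain Python for illustration (the function name is mine, not CAPE's):

```python
# Illustration only: given the task's tags string, the loop above should end up
# attaching one Tag per comma-separated name, skipping blanks and duplicates.
def expected_tag_names(tags: str, existing: list[str]) -> list[str]:
    result = list(existing)
    for tag in (tags or "").split(","):
        tag_name = tag.strip()
        if tag_name and tag_name not in result:
            result.append(tag_name)
    return result


print(expected_tag_names("win10x64,x86", []))  # -> ['win10x64', 'x86']
```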
This may or may not help, but here is my VMSS configuration in `az.conf`:

```ini
[vmss-dev-cape-win10x64]
gallery_image_name = win10x64-cape
platform = windows
arch = x64
pool_tag = win10x64
```

Setting `pool_tag` == `arch` is odd, and may be the issue.
> This may or may not help, but here is my VMSS configuration in `az.conf` [...] Setting `pool_tag` == `arch` is odd, and may be the issue.
I'll try it right away
Well, it didn't work as expected.
This is a dump of the variables at `return self.add()` when submitting a file:
| variable | value |
|---|---|
| cape | '' |
clock | '06-21-2024 12:50:31'
custom | ''
enforce_timeout | False
file_md5 | '2401c281f6798633b66b2a4a14937354'
file_type | ('Composite Document File V2 Document, Little Endian, Os: Windows, Version ' '6.3, MSI Installer, Code page: 1252, Title: Installation Database, Subject: ' 'Skype Meetings App, Author: Microsoft Corporation, Keywords: Installer, ' 'Comments: This installer database contains the logic and data required to ' 'install Skype Meetings App., Template: Intel;0, Revision Number: ' '{C6C0F413-901C-42A8-A7F1-D03BD40F9B12}, Create Time/Date: Sat Aug 3 ' '05:00:26 2019, Last Saved Time/Date: Sat Aug 3 05:00:26 2019, Number of ' 'Pages: 300, Number of Words: 10, Name of Creating Application: Windows ' 'Installer XML Toolset (3.11.1.2318), Security: 2')
fileobj | <lib.cuckoo.common.objects.File object at 0x78d9d998c280>
machine | None
memory | False
obj | <lib.cuckoo.common.objects.File object at 0x78d9d998ccd0>
options | ''
package | 'msi'
parent_id | None
platform | 'windows'
priority | 2
route | 'internet'
sample | <Sample(1,'73fdfb85b80b81c87e78580dc5b46a73c73f7907f8e6cff0886dcb6493365255')>
sample_parent_id | None
self | <lib.cuckoo.core.database._Database object at 0x78d9fa166c80>
shrike_msg | None
shrike_refer | None
shrike_sid | None
shrike_url | None
source_url | False
static | False
tag | 'x86'
tag_name | 'x86'
tags | 'win10x64,x86'
tags_tasks | ''
task | <Task(1,'/tmp/cuckoo-tmp/upload_v3_dh2v4/SkypeMeetingsApp.msi')>
timeout | 200
tlp | None
user_id | 0
username | False
As you can see, it adds `x86` following this logic in `database.py`:
```python
if isinstance(obj, (File, PCAP, Static)):
    fileobj = File(obj.file_path)
    file_type = fileobj.get_type()
    file_md5 = fileobj.get_md5()
    # check if hash is known already
    try:
        with self.session.begin_nested():
            sample = Sample(
                md5=file_md5,
                crc32=fileobj.get_crc32(),
                sha1=fileobj.get_sha1(),
                sha256=fileobj.get_sha256(),
                sha512=fileobj.get_sha512(),
                file_size=fileobj.get_size(),
                file_type=file_type,
                ssdeep=fileobj.get_ssdeep(),
                parent=sample_parent_id,
                source_url=source_url,
            )
            self.session.add(sample)
    except IntegrityError:
        sample = self.session.query(Sample).filter_by(md5=file_md5).first()

    if DYNAMIC_ARCH_DETERMINATION:
        # Assign architecture to task to fetch correct VM type
        # This isn't 100% full proof
        if "PE32+" in file_type or "64-bit" in file_type or package.endswith("_x64"):
            if tags:
                tags += ",x64"
            else:
                tags = "x64"
        else:
            if LINUX_ENABLED and platform == "linux":
                linux_arch = _get_linux_vm_tag(file_type)
                if linux_arch:
                    if tags:
                        tags += f",{linux_arch}"
                    else:
                        tags = linux_arch
            else:
                if tags:
                    tags += ",x86"
                else:
                    tags = "x86"

    task = Task(obj.file_path)
    task.sample_id = sample.id
    if isinstance(obj, (PCAP, Static)):
        # since no VM will operate on this PCAP
        task.started_on = datetime.now()
elif isinstance(obj, URL):
    task = Task(obj.url)
    tags = "x64"
else:
    return None
```
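To make the branch that fires here explicit, this is a simplified, Windows-only restatement of the logic above (an illustrative helper, not the actual CAPE function):

```python
# Simplified restatement of the DYNAMIC_ARCH_DETERMINATION branch above for a
# Windows sample; shows why the MSI from the variable dump gets ",x86" appended.
def append_arch_tag(tags: str, file_type: str, package: str) -> str:
    if "PE32+" in file_type or "64-bit" in file_type or package.endswith("_x64"):
        arch = "x64"
    else:
        arch = "x86"  # the MSI is a Composite Document File, so it falls through to here
    return f"{tags},{arch}" if tags else arch


print(append_arch_tag("win10x64", "Composite Document File V2 Document, ...", "msi"))
# -> 'win10x64,x86'  (matching tags = 'win10x64,x86' in the dump above)
```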
This results in the `sqlalchemy.exc.IntegrityError` that OP posted; this is the error log from the webserver:

```
[21/Jun/2024 12:50:02] "GET /submit/ HTTP/1.1" 200 50025
/opt/CAPEv2/web/../lib/cuckoo/core/database.py:1264: SAWarning: Object of type <Task> not in session, add operation along 'Tag.tasks' won't proceed
with self.session.begin_nested():
Internal Server Error: /submit/
Traceback (most recent call last):
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2108, in _exec_insertmany_context
dialect.do_execute(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "tasks_tags" violates foreign key constraint "tasks_tags_tag_id_fkey"
DETAIL: Key (tag_id)=(2) is not present in table "tags".
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/django/core/handlers/exception.py", line 55, in inner
response = get_response(request)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/django/core/handlers/base.py", line 197, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/opt/CAPEv2/web/submission/views.py", line 403, in index
status, task_ids_tmp = download_file(**details)
File "/opt/CAPEv2/web/../lib/cuckoo/common/web_utils.py", line 837, in download_file
task_ids_new, extra_details = db.demux_sample_and_add_to_db(
File "/opt/CAPEv2/web/../lib/cuckoo/core/database.py", line 1485, in demux_sample_and_add_to_db
task_id = self.add_path(
File "/opt/CAPEv2/web/../lib/cuckoo/core/database.py", line 1333, in add_path
return self.add(
File "/opt/CAPEv2/web/../lib/cuckoo/core/database.py", line 1264, in add
with self.session.begin_nested():
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/util.py", line 146, in __exit__
with util.safe_reraise():
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
raise exc_value.with_traceback(exc_tb)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/util.py", line 144, in __exit__
self.commit()
File "<string>", line 2, in commit
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 136, in _go
ret_value = fn(self, *arg, **kw)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1218, in commit
self._prepare_impl()
File "<string>", line 2, in _prepare_impl
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py", line 136, in _go
ret_value = fn(self, *arg, **kw)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1193, in _prepare_impl
self.session.flush()
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4142, in flush
self._flush(objects)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4277, in _flush
with util.safe_reraise():
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
raise exc_value.with_traceback(exc_tb)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 4238, in _flush
flush_context.execute()
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
rec.execute(self)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 591, in execute
self.dependency_processor.process_saves(uow, states)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/dependency.py", line 1178, in process_saves
self._run_crud(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/dependency.py", line 1241, in _run_crud
connection.execute(statement, secondary_insert)
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
return meth(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 483, in _execute_on_connection
return connection._execute_clauseelement(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
ret = self._execute_context(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1839, in _execute_context
return self._exec_insertmany_context(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2116, in _exec_insertmany_context
self._handle_dbapi_exception(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2108, in _exec_insertmany_context
dialect.do_execute(
File "/home/cape/.cache/pypoetry/virtualenvs/capev2-t2x27zRb-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "tasks_tags" violates foreign key constraint "tasks_tags_tag_id_fkey"
DETAIL: Key (tag_id)=(2) is not present in table "tags".
[SQL: INSERT INTO tasks_tags (task_id, tag_id) VALUES (%(task_id__0)s, %(tag_id__0)s), (%(task_id__1)s, %(tag_id__1)s)]
[parameters: {'task_id__0': 1, 'tag_id__0': 1, 'task_id__1': 1, 'tag_id__1': 2}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
```
(The `django.request` logger then logs `Internal Server Error: /submit/` and repeats the same traceback.)
The problem is not related to machinery, I guess: even when trying to perform static analysis, the same problem happens. It's also not related to cloud hosting; I'm using local hosting.
If that happens locally, you need to provide the versions of the DB, SQLAlchemy, etc., as I have no issues.
> If that happens locally, you need to provide the versions of the DB, SQLAlchemy, etc., as I have no issues.

Could you elaborate on how this could be an issue related to the DB and SQLAlchemy version? From my previous reply, if you read the error log, it seems like a logical issue inside the code. Anyway, since I might be (and hope I am) wrong, these are the versions locally installed:

- PostgreSQL 16.3
- SQLAlchemy Version: 2.0.16

Since OP has the same issue, I think it's highly unlikely to be related to the psql or SQLAlchemy version.
> No, the problem is not directly related to Azure. I found a way to bypass the task/tag relation in database.py by commenting out the section of code in the `add()` function that deals with the tags format [...]
At the moment I removed the comments from the previous reply and commented out these lines instead:
```python
if DYNAMIC_ARCH_DETERMINATION:
    # Assign architecture to task to fetch correct VM type
    # This isn't 100% full proof
    if "PE32+" in file_type or "64-bit" in file_type or package.endswith("_x64"):
        if tags:
            tags += ",x64"
        else:
            tags = "x64"
    else:
        if LINUX_ENABLED and platform == "linux":
            linux_arch = _get_linux_vm_tag(file_type)
            if linux_arch:
                if tags:
                    tags += f",{linux_arch}"
                else:
                    tags = linux_arch
        # else:
        #     if tags:
        #         tags += ",x86"
        #     else:
        #         tags = "x86"
```
Now the tasks_tags relation works correctly. Any idea how to fix that part of the code without removing the section like I did?
@leoiancu21 Thank you for helping, it worked for me.

> @leoiancu21 Thank you for helping, it worked for me.
My pleasure. Does your analysis work completely now? I'm still stuck in the reporting procedure, where the analysis hangs, but I don't know if that is related to this issue.
Yes, exactly.
Is this your case?
@HUSMUS9999 Not exactly, but it could be; I have to debug a bit in order to understand it. I don't think this is related though. I think it would be better to open another issue for that specific problem, since the actual fix for this issue has still not been found and I don't want to mix problems.
Alright, thanks @leoiancu21. Should I close this issue?

Unfortunately it's not fixed, we just used a workaround, so I would leave it open; that way, if someone finds a way to solve the problem, they can reply with a commit and close it.
> [...] these are the versions locally installed:
>
> - PostgreSQL 16.3
> - SQLAlchemy Version: 2.0.16
Because I don't have issues with tags, and I don't have to modify anything at all, and I run a huge number of CAPE servers. I have the same software versions, I just checked.
@tbeadle I just saw this in the log, maybe you can help here a bit more (I have the flu, so my mind is pretty off):
```
/opt/CAPEv2/web/../lib/cuckoo/core/database.py:1264: SAWarning: Object of type <Task> not in session, add operation along 'Tag.tasks' won't proceed
  with self.session.begin_nested():
```
@HUSMUS9999 Could you please put the following in custom/conf/cuckoo.conf:

```ini
[database]
log_statements = on
```

restart cape-web, and try the submission again. I'm interested in the SQL statements issued; they should be in /var/log/django/access.log. The `_get_or_create` call should be issuing a SELECT to see if the x86 Tag exists and, if it doesn't, it should create it at the start of the `begin_nested` call, when the Task object is added to the session, so that it has an ID that can be used to add it to the tags_tasks table.
So far, I have not been able to reproduce the problem.
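For reference, a generic get-or-create along the lines described above might look roughly like this (a sketch of the pattern, not CAPE's exact implementation):

```python
# Generic get-or-create pattern: SELECT first, then create inside a SAVEPOINT so a
# concurrent duplicate insert only rolls back the nested transaction.
from sqlalchemy.exc import IntegrityError


def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).first()
    if instance is not None:
        return instance
    try:
        with session.begin_nested():  # SAVEPOINT
            instance = model(**kwargs)
            session.add(instance)
        return instance
    except IntegrityError:
        # Another transaction created the row first; fetch the existing one.
        return session.query(model).filter_by(**kwargs).one()
```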
I have the same problem.
I changed the following settings and then checked the cape-web logs:

```ini
[database]
log_statements = on
```

As you can see in the log below, the tag was deleted immediately after the new tag was inserted.

```
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,248 INFO sqlalchemy.engine.Engine SELECT tags.id AS tags_id, tags.name AS tags_name
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.name = %(name_1)s
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: LIMIT %(param_1)s
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:SELECT tags.id AS tags_id, tags.name AS tags_name
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.name = %(name_1)s
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: LIMIT %(param_1)s
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,250 INFO sqlalchemy.engine.Engine [generated in 0.00126s] {'name_1': 'x86', 'param_1': 1}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:[generated in 0.00126s] {'name_1': 'x86', 'param_1': 1}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,259 INFO sqlalchemy.engine.Engine INSERT INTO tags (name) VALUES (%(name)s) RETURNING tags.id
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:INSERT INTO tags (name) VALUES (%(name)s) RETURNING tags.id
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,260 INFO sqlalchemy.engine.Engine [generated in 0.00082s] {'name': 'x86'}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:[generated in 0.00082s] {'name': 'x86'}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: /opt/CAPEv2/web/../lib/cuckoo/core/database.py:1264: SAWarning: Object of type <Task> not in session, add operation along 'Tag.tasks' won't proceed
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: with self.session.begin_nested():
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,265 INFO sqlalchemy.engine.Engine DELETE FROM tags WHERE NOT (EXISTS (SELECT 1
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM tasks, tasks_tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.id = tasks_tags.tag_id AND tasks.id = tasks_tags.task_id)) AND NOT (EXISTS (SELECT 1
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM machines, machines_tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.id = machines_tags.tag_id AND machines.id = machines_tags.machine_id))
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:DELETE FROM tags WHERE NOT (EXISTS (SELECT 1
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM tasks, tasks_tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.id = tasks_tags.tag_id AND tasks.id = tasks_tags.task_id)) AND NOT (EXISTS (SELECT 1
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: FROM machines, machines_tags
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: WHERE tags.id = machines_tags.tag_id AND machines.id = machines_tags.machine_id))
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,267 INFO sqlalchemy.engine.Engine [cached since 0.02688s ago] {}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:[cached since 0.02688s ago] {}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,270 INFO sqlalchemy.engine.Engine SAVEPOINT sa_savepoint_2
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:SAVEPOINT sa_savepoint_2
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: 2024-06-28 02:43:47,270 INFO sqlalchemy.engine.Engine [no key 0.00078s] {}
Jun 28 02:43:47 ubuntu2204.localdomain python3[26908]: INFO:sqlalchemy.engine.Engine:[no key 0.00078s] {}
```
I also commented out the following process in `CAPEv2/lib/cuckoo/core/database.py`, and it now works correctly.

```python
# There should be a better way to clean up orphans. This runs after every flush, which is crazy.
# @event.listens_for(self.session, "after_flush")
# def delete_tag_orphans(session, ctx):
#     session.query(Tag).filter(~Tag.tasks.any()).filter(~Tag.machines.any()).delete(synchronize_session=False)
```
I suspect there is a problem with the above process for removing unused tags.
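If that listener turns out to be the culprit, one possible direction (an untested sketch reusing the same query, not a verified fix) would be to run the orphan cleanup explicitly, for example from a periodic maintenance job, instead of on every flush:

```python
# Untested sketch: the same orphan-removal query as the commented-out listener,
# invoked explicitly instead of via an after_flush event.
from lib.cuckoo.core.database import Tag  # CAPE's Tag model


def delete_orphan_tags(session):
    session.query(Tag).filter(~Tag.tasks.any()).filter(~Tag.machines.any()).delete(
        synchronize_session=False
    )
    session.commit()
```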
This appears to be due to a change in behavior between the version of SQLAlchemy that is required via pyproject.toml/poetry.lock or requirements.txt (1.4.50) and the version that you're running (2.0+). I was able to reproduce the error by running `poetry run pip install SQLAlchemy==2.0.16` and then submitting a sample. I'm not sure how you installed CAPE, but if you use `poetry install --sync`, it should install all the dependencies with the versions locked in poetry.lock. Please try this and let me know if that solves the problem.
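A quick way to check which SQLAlchemy version the virtualenv actually resolves to (illustrative only):

```python
# Print the SQLAlchemy version in the running environment; at the time of this
# issue the version pinned in poetry.lock was 1.4.50, not a 2.0.x release.
import sqlalchemy

print("SQLAlchemy", sqlalchemy.__version__)
if not sqlalchemy.__version__.startswith("1.4"):
    print("Unexpected version - try 'poetry install --sync' to restore the pinned one")
```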
Thank you. This problem was solved by downgrading from SQLAlchemy 2.0.31 to 1.4.50.
It was caused by running the following command according to the documentation (https://capev2.readthedocs.io/en/latest/installation/host/installation.html#optional-dependencies):

```bash
sudo -u cape poetry run pip install -r extra/optional_dependencies.txt
```

The above command installed flask-sqlalchemy and upgraded SQLAlchemy from 1.4.50 to 2.0.31.
https://github.com/kevoreilly/CAPEv2/blob/master/extra/optional_dependencies.txt#L7C1-L7C17
OK, commented that out. Thanks Tommy for the help.
After a successful installation, when I try to submit a binary for analysis this happens:

```
IntegrityError
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "tasks_tags" violates foreign key constraint "tasks_tags_tag_id_fkey"
DETAIL: Key (tag_id)=(14) is not present in table "tags".
[SQL: INSERT INTO tasks_tags (task_id, tag_id) VALUES (%(task_id)s, %(tag_id)s)]
[parameters: {'task_id': 14, 'tag_id': 14}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
Traceback (most recent call last)
```