hasura / graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
https://hasura.io
Apache License 2.0
31.1k stars 2.77k forks source link

Postgres Ingestion Question #6340

Closed its-a-feature closed 3 years ago

its-a-feature commented 3 years ago

I have an already created Postgres database instance that I'm pointing a new Hasura docker container to. It's successfully reading and ingesting data, but there's one odd quirk that's preventing me from automatically importing everything, and I can't find any documentation on it.

I have a table called payload that is defined as the following:

class Payload(p.Model):
    # this is actually a sha256 from other information about the payload
    uuid = p.TextField(unique=True, null=False)
    # tag a payload with information like spearphish, custom bypass, lat mov, etc (indicates "how")
    tag = p.TextField(null=True)
    # creator of the payload, cannot be null! must be attributed to somebody (indicates "who")
    operator = p.ForeignKeyField(Operator, null=False)
    creation_time = p.DateTimeField(default=datetime.datetime.utcnow, null=False)
    # this is fine because this is an instance of a payload, so it's tied to one PayloadType
    payload_type = p.ForeignKeyField(PayloadType, null=False)
    # this will signify if a current callback made / spawned a new callback that's checking in
    #   this helps track how we're getting callbacks (which payloads/tags/parents/operators)
    pcallback = p.DeferredForeignKey("Callback", null=True)
    operation = p.ForeignKeyField(Operation, null=False)
    wrapped_payload = p.ForeignKeyField("self", null=True)
    deleted = p.BooleanField(null=False, default=False)
    # if the payload is in the build process: building, success, error
    build_container = p.TextField(null=False)
    build_phase = p.TextField(null=False, default="building")
    # capture error or any other info
    build_message = p.TextField(null=False, default="")
    # if there is a slack webhook for the operation, decide if this payload should generate an alert or not
    callback_alert = p.BooleanField(null=False, default=True)
    # when dealing with auto-generated payloads for lateral movement or spawning new callbacks
    auto_generated = p.BooleanField(null=False, default=False)
    task = p.DeferredForeignKey("Task", null=True)
    file = p.DeferredForeignKey("FileMeta", null=True)

The main thing to look at here is: wrapped_payload = p.ForeignKeyField("self", null=True) - a field in this table that references another Payload row. When the resulting database is imported by Hasura, I get the following relationships:

operator_id → operator . id - payload_operator_id_fkey
payload_type_id → payloadtype . id - payload_payload_type_id_fkey
operation_id → operation . id - payload_operation_id_fkey
wrapped_payload_id → payload . id - payload_wrapped_payload_id_fkey
wrapped_payload_id → payload . id - fk_payload_wrapped_payload_id_refs_payload
pcallback_id → callback . id - fk_payload_pcallback_id_refs_callback
task_id → task . id - fk_payload_task_id_refs_task
file_id → filemeta . id - fk_payload_file_id_refs_filemeta

I can't find out why there are two instance of wrapped_payload_id or what the difference is between them. Because two are created, I can't automatically import and have to delete one for the relationships to be added. I can't determine the difference or meaning behind the foreign relationship that ends in fkey or the one that starts with fk and ends with a refs. What's confusing me is I have another table, below that works just fine:

class FileBrowserObj(p.Model):
    task = p.ForeignKeyField(Task, null=False)
    timestamp = p.DateTimeField(null=False, default=datetime.datetime.utcnow)
    operation = p.ForeignKeyField(Operation, null=False)
    # this should be the fqdn of the host the info is from
    host = p.BlobField(null=False)
    permissions = p.TextField(null=False, default="")
    # this is the name of this file/folder
    name = p.BlobField(null=False)
    # this is the parent object
    parent = p.ForeignKeyField('self', null=True)
    # this is the full path for the parent folder
    # we need this to enable faster searching and better context
    parent_path = p.BlobField(null=False, default="")
    full_path = p.BlobField(null=False, default="")
    access_time = p.TextField(null=False, default="")
    modify_time = p.TextField(null=False, default="")
    comment = p.TextField(null=False, default="")
    # this is how we differentiate between files and folders of information
    is_file = p.BooleanField(null=False, default=False)
    size = p.TextField(null=False, default="")
    # indicates if we successfully pulled info about the object. False would be access denied for example
    success = p.BooleanField(null=True)
    deleted = p.BooleanField(null=False, default=False)

In this case, I also have parent = p.ForeignKeyField('self', null=True) - a field that references an instance of the same table. In this case though, upon parsing the database, only the following entry is created:

operation_id → operation . id - filebrowserobj_operation_id_fkey
task_id → task . id - filebrowserobj_task_id_fkey
parent_id → filebrowserobj . id - filebrowserobj_parent_id_fkey

So, what's the difference between the two examples?

tirumaraiselvan commented 3 years ago

Can you check your database for what all foreign keys are present on your Payload table. Many ways to do this, one way using psql : \d+ tablename

These relationships are being inferred from the underlying foreign keys in the database.

its-a-feature commented 3 years ago
                                                                Table "public.payload"
       Column       |            Type             | Collation | Nullable |               Default               | Storage  | Stats target | Description 
--------------------+-----------------------------+-----------+----------+-------------------------------------+----------+--------------+-------------
 id                 | integer                     |           | not null | nextval('payload_id_seq'::regclass) | plain    |              | 
 uuid               | text                        |           | not null |                                     | extended |              | 
 tag                | text                        |           |          |                                     | extended |              | 
 operator_id        | integer                     |           | not null |                                     | plain    |              | 
 creation_time      | timestamp without time zone |           | not null |                                     | plain    |              | 
 payload_type_id    | integer                     |           | not null |                                     | plain    |              | 
 operation_id       | integer                     |           | not null |                                     | plain    |              | 
 wrapped_payload_id | integer                     |           |          |                                     | plain    |              | 
 deleted            | boolean                     |           | not null |                                     | plain    |              | 
 build_container    | text                        |           | not null |                                     | extended |              | 
 build_phase        | text                        |           | not null |                                     | extended |              | 
 build_message      | text                        |           | not null |                                     | extended |              | 
 callback_alert     | boolean                     |           | not null |                                     | plain    |              | 
 auto_generated     | boolean                     |           | not null |                                     | plain    |              | 
 pcallback_id       | integer                     |           |          |                                     | plain    |              | 
 task_id            | integer                     |           |          |                                     | plain    |              | 
 file_id            | integer                     |           |          |                                     | plain    |              | 
Indexes:
    "payload_pkey" PRIMARY KEY, btree (id)
    "payload_file_id" btree (file_id)
    "payload_operation_id" btree (operation_id)
    "payload_operator_id" btree (operator_id)
    "payload_payload_type_id" btree (payload_type_id)
    "payload_pcallback_id" btree (pcallback_id)
    "payload_task_id" btree (task_id)
    "payload_uuid" UNIQUE, btree (uuid)
    "payload_wrapped_payload_id" btree (wrapped_payload_id)
Foreign-key constraints:
    "fk_payload_file_id_refs_filemeta" FOREIGN KEY (file_id) REFERENCES filemeta(id)
    "fk_payload_pcallback_id_refs_callback" FOREIGN KEY (pcallback_id) REFERENCES callback(id)
    "fk_payload_task_id_refs_task" FOREIGN KEY (task_id) REFERENCES task(id)
    "fk_payload_wrapped_payload_id_refs_payload" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)
    "payload_operation_id_fkey" FOREIGN KEY (operation_id) REFERENCES operation(id)
    "payload_operator_id_fkey" FOREIGN KEY (operator_id) REFERENCES operator(id)
    "payload_payload_type_id_fkey" FOREIGN KEY (payload_type_id) REFERENCES payloadtype(id)
    "payload_wrapped_payload_id_fkey" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)
Referenced by:
    TABLE "buildparameterinstance" CONSTRAINT "buildparameterinstance_payload_id_fkey" FOREIGN KEY (payload_id) REFERENCES payload(id)
    TABLE "c2profileparametersinstance" CONSTRAINT "c2profileparametersinstance_payload_id_fkey" FOREIGN KEY (payload_id) REFERENCES payload(id)
    TABLE "callback" CONSTRAINT "callback_registered_payload_id_fkey" FOREIGN KEY (registered_payload_id) REFERENCES payload(id)
    TABLE "payload" CONSTRAINT "fk_payload_wrapped_payload_id_refs_payload" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)
    TABLE "payload" CONSTRAINT "payload_wrapped_payload_id_fkey" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)
    TABLE "payloadc2profiles" CONSTRAINT "payloadc2profiles_payload_id_fkey" FOREIGN KEY (payload_id) REFERENCES payload(id)
    TABLE "payloadcommand" CONSTRAINT "payloadcommand_payload_id_fkey" FOREIGN KEY (payload_id) REFERENCES payload(id)
    TABLE "payloadonhost" CONSTRAINT "payloadonhost_payload_id_fkey" FOREIGN KEY (payload_id) REFERENCES payload(id)
Triggers:
    newpayload_trigger AFTER INSERT ON payload FOR EACH ROW EXECUTE FUNCTION notify_newpayload()
    updatedpayload_trigger AFTER UPDATE ON payload FOR EACH ROW EXECUTE FUNCTION notify_updatedpayload()
its-a-feature commented 3 years ago

In there I also see the two offending lines:

"payload_wrapped_payload_id_fkey" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)
"fk_payload_wrapped_payload_id_refs_payload" FOREIGN KEY (wrapped_payload_id) REFERENCES payload(id)

But what's causing it to appear twice and what's the difference? Half of those foreign-key constraints are fk_*_refs_* and half are *_id_fkey but I don't see a rhyme or reason what makes the determination

tirumaraiselvan commented 3 years ago

But what's causing it to appear twice and what's the difference? Half of those foreign-key constraints are fk_refs and half are *_id_fkey but I don't see a rhyme or reason what makes the determination

This was already there in the database (i.e. Hasura wouldn't know why it exists). You should check the method by which the table was created initially.

its-a-feature commented 3 years ago

I think i figured it out. Apparently postgres is ok with things being double defined for relationships and doesn't complain. So something was double-defining that relationship which i guess didn't matter with how I was using it before, but when hasura imported and saw the double define, it barfed.

Thanks!