lunasec-io / lunasec

LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTrace GitHub App: https://github.com/marketplace/lunatrace-by-lunasec/
https://www.lunasec.io/

Package reference embeddings #1151

Open breadchris opened 1 year ago

breadchris commented 1 year ago

Add package reference embeddings to the database and generate them from the CLI. This will let us run semantic search over package READMEs.
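To illustrate what semantic search over README text means here, the following is a minimal sketch, not the code in this PR. It ranks text chunks by cosine similarity between a query embedding and each chunk embedding. The function names, the sample chunks, and the 3-dimensional toy vectors are all hypothetical; real embeddings would come from an embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def semantic_search(query_embedding, chunks, top_k=3):
    """Rank (text, embedding) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors: hypothetical stand-ins for real embedding model output.
readme_chunks = [
    ("Parses YAML configuration files", [0.9, 0.1, 0.0]),
    ("HTTP client with retry support", [0.0, 0.2, 0.9]),
    ("Loads settings from YAML or JSON", [0.8, 0.3, 0.1]),
]
results = semantic_search([1.0, 0.0, 0.0], readme_chunks, top_k=2)
```

With these toy vectors the two YAML-related chunks rank ahead of the HTTP client chunk, because their embeddings point in nearly the same direction as the query.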

github-actions[bot] commented 1 year ago

Hasura Semantic Diff

Hasura config files have changed. This comment shows which fields have changed, ignoring formatting.

```
(root level)
+ two map entries added:
  table:
    name: content_embedding
    schema: package
  object_relationships:
  - name: reference_content
    using:
      foreign_key_constraint_on: reference_content_id

array_relationships
+ one list entry added:
  - name: reference_contents
    using:
      foreign_key_constraint_on:
        column: package_id
        table:
          name: reference_content
          schema: package

(root level)
+ three map entries added:
  table:
    name: reference_content
    schema: package
  object_relationships:
  - name: package
    using:
      foreign_key_constraint_on: package_id
  array_relationships:
  - name: content_embeddings
    using:
      foreign_key_constraint_on:
        column: reference_content_id
        table:
          name: content_embedding
          schema: package

diff --git a/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/down.sql b/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/down.sql
new file mode 100644
index 00000000..504ab7e0
--- /dev/null
+++ b/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/down.sql
@@ -0,0 +1,2 @@
+DROP TABLE "package"."content_embedding";
+DROP TABLE "package"."reference_content";
diff --git a/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/up.sql b/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/up.sql
new file mode 100644
index 00000000..48c39ca7
--- /dev/null
+++ b/lunatrace/bsl/hasura/migrations/lunatrace/1677850286590_package_reference_embeddings/up.sql
@@ -0,0 +1,25 @@
+CREATE TABLE "package"."reference_content" (
+  "id" uuid NOT NULL DEFAULT gen_random_uuid(),
+  "package_id" uuid NOT NULL REFERENCES "package"."package"("id") ON UPDATE cascade ON DELETE cascade,
+  "url" text NOT NULL,
+  "content" text NOT NULL,
+  "normalized_content" text NOT NULL,
+  "content_type" text NOT NULL,
+  "last_successful_fetch" timestamptz DEFAULT NULL,
+  PRIMARY KEY ("id"),
+  UNIQUE ("package_id", "url")
+);
+
+CREATE TABLE "package"."content_embedding" (
+  "id" uuid NOT NULL DEFAULT gen_random_uuid(),
+  "content_hash" text NOT NULL,
+  "reference_content_id" uuid NOT NULL REFERENCES "package"."reference_content"("id") ON UPDATE cascade ON DELETE cascade,
+  "content" text NOT NULL,
+  "embedding" vector (1536) NOT NULL,
+  PRIMARY KEY ("id"),
+  UNIQUE ("content_hash")
+);
+
+CREATE INDEX ON "package"."content_embedding"
+  USING ivfflat (embedding vector_cosine_ops)
+  WITH (lists = 100);
```
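For reference, a nearest-neighbour lookup against the tables in this migration might be built roughly as below. This is a sketch under assumptions, not code from this PR: the helper names are hypothetical, and the `%(name)s` placeholders assume a psycopg-style driver. It relies only on pgvector's `<=>` cosine distance operator, which is the operator that the `vector_cosine_ops` index class serves, and on pgvector's `[x,y,...]` text input format.

```python
EMBEDDING_DIM = 1536  # matches `vector (1536)` in up.sql


def to_vector_literal(embedding):
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Hypothetical query: the closest embedded chunks plus the URL they came from.
# Smaller cosine distance means more similar.
NEAREST_CHUNKS_SQL = """
SELECT ce.content,
       rc.url,
       ce.embedding <=> %(query)s::vector AS cosine_distance
FROM package.content_embedding AS ce
JOIN package.reference_content AS rc ON rc.id = ce.reference_content_id
ORDER BY ce.embedding <=> %(query)s::vector
LIMIT %(limit)s
"""


def build_nearest_chunks_query(query_embedding, limit=5):
    """Return (sql, params) ready to pass to a psycopg-style cursor.execute()."""
    params = {"query": to_vector_literal(query_embedding), "limit": limit}
    return NEAREST_CHUNKS_SQL, params
```

Because ivfflat is an approximate index, results can miss some true nearest neighbours. pgvector exposes the `ivfflat.probes` setting to trade query speed for recall.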