docintelapp / DocIntel

Open Source Platform for storing, organizing, and searching documents related to cyber threats
https://docintel.org

Get observables failed sometime #108

Open ludoComp9 opened 1 month ago

ludoComp9 commented 1 month ago

Hello,

On a DocIntel 2.1.2 instance running in an Ubuntu 20.04 VM, I defined some facets (for CVEs, threat actor names, and so on) and uploaded files through the API. The facets are defined as follows:

  tlp:
    title: 'Traffic Light Protocol'
    description: 'Matches common TLP notations'
    extraction_regex: 'TLP[\s_\-\:](white|clear|green|amber\+strict|amber|red)'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  pap:
    title: 'Permissible Actions Protocol'
    description: 'Matches common PAP notations'
    extraction_regex: 'PAP[\s_\-\:](white|clear|green|amber\+strict|amber|red)'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  vulnerabilities:
    title: 'Vulnerabilities'
    description: 'Matches CVE identifiers'
    extraction_regex: 'CVE-\d{4}-\d{4,7}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.mandiant:
    title: 'Threat Actor group names (Mandiant definition)'
    description: 'Matches Mandiant group names'
    extraction_regex: '(APT|FIN|UNC)\d{1,}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.microsoft:
    title: 'Threat Actor group names (Microsoft definition)'
    description: 'Matches Microsoft Threat actor names'
    extraction_regex: '(STORM|DEV)-\d{4,}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.360:
    title: 'Threat Actor 360.net group names'
    description: 'Matches 360.net group names'
    extraction_regex: 'APT-C-\d{1,4}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.proofpoint:
    title: 'Threat Actor proofpoint group names'
    description: 'Matches proofpoint group names'
    extraction_regex: 'TA\d{3,}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.cert-ua:
    title: 'Threat Actor CERT-UA group names'
    description: 'Matches CERT-UA group names'
    extraction_regex: 'UAC-\d{4,}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  actor.common:
    title: 'Threat Actor group names'
    description: 'Matches Threat actor names'
    extraction_regex: '\w{3,}\s?(spider|panda|bear|kitten|sphinx|tiger|chollima)'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  attack.groups:
    title: 'Mitre ATT&CK threat actor group IDs'
    description: 'Matches Mitre ATT&CK threat actor group identifiers'
    extraction_regex: 'G\d{4}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  attack.techniques:
    title: 'Mitre ATT&CK techniques'
    description: 'Matches ATT&CK techniques & sub-techniques'
    extraction_regex: 'T[0-9]{4}(\.[0-9]{3})?'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
  malware:
    title: 'Malware'
    description: 'A facet to add common malware names as tags'
    extraction_regex:
    mandatory: false
    auto_extract: true
    tag_normalization: upcase
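
As a rough illustration of what these facets are meant to do, the Python sketch below applies two of the extraction patterns to a piece of text and upper-cases the matches, mirroring `tag_normalization: upcase`. It is not DocIntel's implementation: the `FACETS` dictionary and the `extract_tags` function are hypothetical names, and case-insensitive matching is an assumption. The point is that case variants such as `apt28` and `APT28` collapse to a single label per facet after normalization.

```python
import re

# Hypothetical subset of the facets above: facet prefix -> extraction regex.
FACETS = {
    "vulnerabilities": r"CVE-\d{4}-\d{4,7}",
    "actor.mandiant": r"(APT|FIN|UNC)\d{1,}",
}


def extract_tags(text, facets=FACETS):
    """Return a sorted list of unique (facet, label) pairs found in text."""
    found = set()
    for facet, pattern in facets.items():
        # IGNORECASE is an assumption; the issue does not say how DocIntel matches.
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Mirror `tag_normalization: upcase`: case variants become one label.
            found.add((facet, match.group(0).upper()))
    return sorted(found)


if __name__ == "__main__":
    sample = "Report mentions APT28, apt28, FIN7 and CVE-2023-23397."
    for facet, label in extract_tags(sample):
        print(f"{facet}:{label}")
```

Collecting the pairs in a set means each (facet, label) combination is produced once per document, however often it appears in the text.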

Sometimes it works, but often it fails, and I see the following error messages in the logs of the docintel-dev-document-analyzer container:

INSERT INTO "DocumentTag" ("DocumentId", "TagId")
VALUES (@p226, @p227);
2024-05-28 11:33:38.4374 [ERROR] [Microsoft.EntityFrameworkCore.Update] An exception occurred in the database while saving changes for context type 'DocIntel.Core.Models.DocIntelContext'.
Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while saving the entity changes. See the inner exception for details.
 ---> Npgsql.PostgresException (0x80004005): 23505: duplicate key value violates unique constraint "IX_Tags_FacetId_Label"

DETAIL: Detail redacted as it may contain sensitive data. Specify 'Include Error Detail' in the connection string to include this information.
   at Npgsql.Internal.NpgsqlConnector.<ReadMessage>g__ReadMessageLong|234_0(NpgsqlConnector connector, Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
  Exception data:
    Severity: ERROR
    SqlState: 23505
    MessageText: duplicate key value violates unique constraint "IX_Tags_FacetId_Label"
    Detail: Detail redacted as it may contain sensitive data. Specify 'Include Error Detail' in the connection string to include this information.
    SchemaName: public
    TableName: Tags
    ConstraintName: IX_Tags_FacetId_Label
    File: nbtinsert.c
    Line: 666
    Routine: _bt_check_unique
   --- End of inner exception stack trace ---
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
2024-05-28 11:33:38.4374 [ERROR] [DocIntel.Core.Utils.DocumentAnalyzerUtility] Document 4981bd7b-c880-4da9-b4ae-8910005828a0 could not be analyzed (An error occurred while saving the entity changes. See the inner exception for details.)

Any ideas?

Regards,
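
For readers following the log: SQLSTATE 23505 is PostgreSQL's `unique_violation`, and the constraint name `IX_Tags_FacetId_Label` suggests a unique index over the `FacetId` and `Label` columns of the `Tags` table. The log does not show which rows collided (adding `Include Error Detail` to the connection string, as the message says, would reveal that), so the cause stays open. The sketch below only reproduces the class of error and one common way to avoid it. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table layout and the `make_db` and `get_or_create_tag` names are assumptions for illustration, not DocIntel's schema or code.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a unique (facet_id, label) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tags ("
        "id INTEGER PRIMARY KEY, "
        "facet_id INTEGER NOT NULL, "
        "label TEXT NOT NULL)"
    )
    # Stand-in for the IX_Tags_FacetId_Label constraint named in the log.
    conn.execute(
        "CREATE UNIQUE INDEX ix_tags_facetid_label ON tags (facet_id, label)"
    )
    return conn


def get_or_create_tag(conn, facet_id, label):
    """Insert the tag if it is missing, then return its id."""
    # INSERT OR IGNORE skips the row when (facet_id, label) already exists,
    # so a repeated label does not raise a uniqueness error.
    conn.execute(
        "INSERT OR IGNORE INTO tags (facet_id, label) VALUES (?, ?)",
        (facet_id, label),
    )
    row = conn.execute(
        "SELECT id FROM tags WHERE facet_id = ? AND label = ?",
        (facet_id, label),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    print(get_or_create_tag(conn, 1, "APT28"))
    print(get_or_create_tag(conn, 1, "APT28"))  # same id, no second row
    try:
        # A plain INSERT of an existing pair fails, like the error in the log.
        conn.execute(
            "INSERT INTO tags (facet_id, label) VALUES (?, ?)", (1, "APT28")
        )
    except sqlite3.IntegrityError as exc:
        print("duplicate rejected:", exc)
```

In PostgreSQL the comparable statement is `INSERT ... ON CONFLICT DO NOTHING`. Whether DocIntel creates tags this way is not stated in the issue.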

ludoComp9 commented 1 month ago

Hello,

After reinstalling, I defined only one facet:

actor.mandiant:
    title: 'Threat Actor group names (Mandiant definition)'
    description: 'Matches Mandiant group names'
    extraction_regex: '(APT|FIN|UNC)\d{1,}'
    mandatory: false
    auto_extract: true
    tag_normalization: upcase

and imported CERT-EU reports via the API. I observed the same issue: