BeanieODM / beanie

Asynchronous Python ODM for MongoDB
http://beanie-odm.dev/
Apache License 2.0

[BUG] beanie iterative migration with multiple migrations fails with TransientTransactionError when there are a lot of documents #727

Closed prabhumarappan closed 6 months ago

prabhumarappan commented 1 year ago

Describe the bug

Running a migration file (multiple migrations covering 210k documents) raised the following error: [screenshot: "Screenshot 2023-10-02 at 15 12 01"]

The changes from the first migration had already been committed even though the second migration raised the error.

I got the same error after reducing the number of documents to 180k, but at 165k the migration succeeded.

Expected behavior

  1. None of the migrations should have been committed, since there was an error (the 1st migration's changes are applied even if the 2nd migration raises an error).
  2. Even if the migration takes more than 2 minutes, it should go through irrespective of the 120-second transaction limit.

Also, since the transaction timeout defaults to 2 minutes, I think there should be a way to pass the transaction timeout (max_commit_time_ms) as a parameter before starting a migration.

roman-right commented 1 year ago

Good catch! Thank you. I'll fix it this week or next.

harris commented 1 year ago

@roman-right any updates on this bad boy? Our process right now requires us to take a snapshot of the db before we run any migration because we never know if it will fail in the middle or not.

roman-right commented 1 year ago

Hi @harris, Unfortunately, I haven't had time in the past few weeks. It will be addressed during the next bug-fixing session. PRs are always welcome.

roman-right commented 10 months ago

Hi @prabhumarappan, could you please provide a reproducible example? I couldn't reproduce it on my side. If it needs a very large amount of data or a specific Docker configuration, please provide those as well. Thank you!

zakajd commented 10 months ago

+1 on this issue. I have a relatively small collection (<10k documents), but wanted to do a large migration that would split / create a bunch of new documents. I got the same error as prabhumarappan:

pymongo.errors.OperationFailure: Transaction with { txnNumber: 1 } has been aborted., full error:
{
    'errorLabels': ['TransientTransactionError'],
    'ok': 0.0,
    'errmsg': 'Transaction with { txnNumber: 1 } has been aborted.',
    'code': 251,
    'codeName': 'NoSuchTransaction',
    '$clusterTime': {
        'clusterTime': Timestamp(1701914803, 1),
        'signature': {'hash': b'\x87\x8f*spF\xf1\xdd\x8c\x9f?M\xa0c\xa5\xa1_h\x8ee', 'keyId': 7276087405910687746}
    },
    'operationTime': Timestamp(1701914803, 1)
}
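
The `errorLabels` field in the error above is what drivers use to decide whether a transaction is worth retrying. A minimal sketch of that check follows. `run_with_retry` and `FakeTransientError` are hypothetical names used only for illustration; the one real API mirrored here is PyMongo's `has_error_label` method on its exceptions. Retrying only helps with short-lived faults: a migration that is aborted because it runs too long will most likely hit the same limit on every attempt.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, TypeVar

T = TypeVar("T")

TRANSIENT_LABEL = "TransientTransactionError"


def is_transient(exc: BaseException) -> bool:
    """Return True if the exception carries the TransientTransactionError label."""
    # PyMongo exceptions expose has_error_label(); plain exceptions do not.
    check = getattr(exc, "has_error_label", None)
    return bool(callable(check) and check(TRANSIENT_LABEL))


async def run_with_retry(
    txn: Callable[[], Awaitable[T]], max_attempts: int = 3
) -> T:
    """Re-run the whole transaction callback while the failure is transient."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await txn()
        except Exception as exc:
            # Give up on the last attempt or on any non-transient error.
            if attempt == max_attempts or not is_transient(exc):
                raise
    raise AssertionError("unreachable")


class FakeTransientError(Exception):
    """Stand-in for pymongo.errors.OperationFailure, used only in this demo."""

    def has_error_label(self, label: str) -> bool:
        return label == TRANSIENT_LABEL


async def demo() -> Tuple[int, str]:
    calls = 0

    async def flaky() -> str:
        # Fails twice with a transient error, then succeeds.
        nonlocal calls
        calls += 1
        if calls < 3:
            raise FakeTransientError("Transaction has been aborted.")
        return "committed"

    result = await run_with_retry(flaky, max_attempts=5)
    return calls, result


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real migration the callback would open a session, start the transaction, and apply the changes, so that every retry starts from a fresh transaction.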
github-actions[bot] commented 9 months ago

This issue is stale because it has been open 30 days with no activity.

roman-right commented 9 months ago

This PR might help: https://github.com/roman-right/beanie/pull/828 @zakajd, @harris, @prabhumarappan, could you please try it?

mmabrouk commented 9 months ago

@roman-right, we encountered the same problem during a migration. We attempted to resolve it by setting a large max_commit_time_ms value manually in the code to allow the transaction to run for more than two minutes. However, this didn't solve the issue. We received the following error:

pymongo.errors.OperationFailure: Transaction with { txnNumber: 1 } has been aborted., full error: {'errorLabels': ['TransientTransactionError'], 'ok': 0.0, 'errmsg': 'Transaction with { txnNumber: 1 } has been aborted.', 'code': 251, 'codeName': 'NoSuchTransaction', '$clusterTime': {'clusterTime': Timestamp(1705515449, 1),

We managed to fix the problem only by using the no-transaction option.

Do you have any ideas as to why this problem might have arisen?
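
One possible explanation, not confirmed anywhere in this thread: in PyMongo and Motor, `max_commit_time_ms` bounds only the `commitTransaction` command, while the total lifetime of a transaction is capped by the server parameter `transactionLifetimeLimitSeconds` (60 seconds by default according to the MongoDB manual). A transaction that outlives that cap is aborted on the server, and later operations on it fail with `NoSuchTransaction`, which matches the error above. The sketch below only builds the `setParameter` command document. `lifetime_limit_command` is a hypothetical helper, and running the command needs admin privileges that managed deployments may not grant.

```python
from typing import Any, Dict


def lifetime_limit_command(seconds: int) -> Dict[str, Any]:
    """Build the admin command that changes the transaction lifetime cap."""
    if seconds < 1:
        raise ValueError("seconds must be a positive integer")
    # Key order matters: the command name has to be the first key.
    return {"setParameter": 1, "transactionLifetimeLimitSeconds": seconds}


# Usage with an already connected Motor client (assumed, not shown here):
#     await client.admin.command(lifetime_limit_command(600))
```

Raising the cap keeps long transactions alive but also lets them hold resources on the server for longer, so it is a trade-off rather than a fix.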

prabhumarappan commented 8 months ago

This PR might help: #828 @zakajd, @harris, @prabhumarappan, could you please try it?

@roman-right this does not really fix the issue. PR #828 just adds a no-transaction option on the command line. We did much the same thing, except that we disabled transactions for migrations directly in the code so that ours could run.

So the issue is still there.
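
For anyone running with transactions disabled, as described above, a failure partway through leaves the collection half migrated. One way to limit the damage, sketched here as an assumption rather than anything Beanie provides, is to process documents in small batches and make the per-document step idempotent so the migration can simply be run again. `batched`, `migrate_in_batches`, and the `schema_version` field are hypothetical, and `upgrade_one` stands in for the real per-document change.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def migrate_in_batches(
    docs: Iterable[dict],
    upgrade_one: Callable[[dict], None],
    size: int = 1000,
) -> int:
    """Upgrade every document not yet migrated; return how many were changed."""
    changed = 0
    for batch in batched(docs, size):
        # In a real migration each batch would be one bulk write
        # (or one short transaction) against the database.
        for doc in batch:
            if doc.get("schema_version", 1) >= 2:
                continue  # already migrated, so a re-run skips it
            upgrade_one(doc)
            doc["schema_version"] = 2
            changed += 1
    return changed
```

Because migrated documents are skipped, running the function a second time after a crash only touches what is left. This gives up the all-or-nothing behaviour the original report asked for.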

github-actions[bot] commented 6 months ago

This issue was closed because it has been stalled for 14 days with no activity.