Closed: Uzlopak closed this 2 years ago
@clintonb
No, this is the work of 5 days and nights. I will not rip this apart; I simply don't have the energy to do it. Also, I am very confident in the code.
Also, despite it showing that 45 files were changed, only 13 are active code files. The rest are configuration files, unit tests, interfaces, etc. So if you focus on just the 13 files in the src folder, you can manage it. Also keep in mind that you have 100% test coverage. You can, for example, go to book.spec.ts, add `.only` to the book describe block, then select e.g. balance, run `npm run test:coverage`, and check whether the code is still covered 100% (see the sketch below). If it is, check whether the unit tests are really complete and make sense. Or start with the helper methods.
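For example (a hypothetical excerpt; the actual describe/it names in book.spec.ts may differ):

```ts
// book.spec.ts - add `.only` so the coverage run executes just this block.
// The block and test names here are illustrative, not the repo's actual ones.
describe.only("book", () => {
  it("balance", async () => {
    // ... assertions against book.balance() go here
  });
});
```

Then `npm run test:coverage` reports whether the focused code paths are still fully covered.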
Please clone the repo and read the code with your favorite IDE. This would already help a lot. :)
It’s nice you’ve done this work, but I’m not reviewing it. This isn’t my project, and the scope of the pull request is simply too large.
You already have separate commits. I highly recommend breaking this into multiple pull requests to make the work easier to review. As a maintainer of other open source projects, it's far easier and faster for me to review small, focused pull requests than giant ones like this. 100% test coverage does not mean 100% bug-free.
@clintonb I know what you mean. In OSS you have a lot of contributors, and it is easier to review one small PR than a whole rewrite. But my experience as a professional software developer is that you sometimes have to get out of your comfort zone and do the hard work and research to get the best result. :) Also, this package is fairly simple.
And breaking it into multiple PRs is not an option, so I have to decline your suggestion. I mean, come on... these are basically 13 files. I regularly have PRs with more than a hundred file changes... Not ideal, but you get used to it. And having good code coverage and a good test structure helps you review each file in a fair amount of time. Of course, if you write one unit test that does everything, then yes, you get 100% coverage fairly easily. But that kind of garbage gets rejected by me regularly, too.
Best Regards :)
@koresar
Good morning to Australia. Medici should now be in awesome shape. Only the pagination in balance is kind of inconsistent. Maybe you can tell me if and what I should do with it? Keep it as is? Rip out pagination? Fix pagination?
Other than that, my PR is ready so far. I don't know what else you could do to improve the performance.
The current changelog:

- Added `IOptions` type.
- Added `mongoTransaction`-method, a convenience shortcut for `mongoose.connection.transaction`.
- Added `.initModels`, which initializes the underlying `transactionModel` and `journalModel`. Use this after you have connected to the MongoDB server if you want to use transactions. Otherwise you could get an "Unable to read from a snapshot due to pending collection catalog changes; please retry the operation."-error when acquiring a session, because the actual database collection is still being created by the underlying mongoose instance.
- BREAKING: No lowercase `book` export anymore. Only `Book` is supported: `require("medici").Book`.
- BREAKING: Mongoose v6 is the only supported version now. Avoid using both v5 and v6 in the same project.
- BREAKING: `.ledger()` returns lean Transaction objects for better performance. To retrieve hydrated Transaction objects, set `lean` to `false` in the third parameter of `.ledger()`. It is recommended not to hydrate the transactions, as hydration implies that the transactions could be manipulated, putting Medici's data integrity at risk.
- Added `precision`: `Book` now accepts an optional second parameter where you can set the precision used internally by Medici. The default is 7 digits after the decimal point. JavaScript has floating-point precision issues and can only handle about 16 digits of precision; e.g. 0.1 + 0.2 results in 0.30000000000000004 and not 0.3. The default precision of 7 digits after the decimal point yields the correct result of 0.1 + 0.2 = 0.3. The default value is taken from medici version 4.0.2. Be careful if you use a currency with more decimal places: Bitcoin, for example, has a precision of 8 digits after the decimal point, so for Bitcoin you should set `precision` to 8. You can enforce an integer-only mode by setting `precision` to 0, but keep in mind that JavaScript has a max safe integer of 9007199254740991. (See the sketch after this list.)
- Added `maxAccountPath`: You can set the maximum number of account paths via the second parameter of `Book`. This can improve the performance of `.balance()` and `.ledger()` calls, as they will then use the `accounts` attribute of the transactions as a filter.
- Added validation of the `name` of a `Book` and of `maxAccountPath` and `precision`. A `name` must not be an empty string or a string containing only whitespace characters. `precision` must be an integer greater than or equal to 0. `maxAccountPath` must be an integer greater than or equal to 0.
- Added `setJournalSchema` and `setTransactionSchema` to support custom schemas. They ensure that all relevant middlewares and methods are also added when using custom schemas. Call medici's `syncIndexes`-method after `setTransactionSchema` to enforce the defined indexes on the models.
- BREAKING: `__proto__` can not be used as a property of a Transaction, a Journal, or their `meta` field; such properties get silently filtered out.
- BREAKING: In `book.void()`, the provided `journal_id` has to belong to the `book`. If the journal does not exist within the book, medici throws a `JournalNotFoundError`. In medici < 5 you could theoretically void a `journal` of another `book`.
- Added `lockModel` to make it possible to call `.balance()` and get a reliable result while using a mongo session. Call `.lockAccounts()` with the first parameter being an array of the accounts you want to lock, e.g. `book.lockAccounts(["Assets:User:User1"], { session })`. For best performance, call `lockAccounts` as the last operation in the transaction.

@koresar If you want to remove paginated balance: #36
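A minimal sketch of the new `Book` options described in the changelog above; the option names `precision` and `maxAccountPath` come from this PR, while the book name, accounts, and amounts are made up for illustration:

```ts
import { Book } from "medici";

// Hypothetical usage of the optional second parameter added in this PR.
const book = new Book("MyBook", { precision: 7, maxAccountPath: 3 });

await book
  .entry("Two small postings")
  .debit("Assets:Cash", 0.1)
  .debit("Assets:Cash", 0.2)
  .credit("Income", 0.3)
  .commit();

const { balance } = await book.balance({ account: "Assets:Cash" });
// With the default 7-digit precision the 0.1 and 0.2 postings sum to
// exactly 0.3 (up to sign convention), not 0.30000000000000004.
```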
@nicolasburtey Implemented a lockAccounts method on book, which should be called as the last operation in the transaction. This drastically improves the performance of transactions in my unit test: before, it took about 700 ms; now it takes about 200 ms. I took this idea from MongoDB Performance Tuning (2021), p. 217:
> If we move the contentious statement to the end of the transaction, then the chance of a TransientTransactionError will be reduced, since the window for conflict will be reduced to the final few moments in the execution of the transaction.
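A minimal sketch of this ordering advice, assuming the `mongoTransaction` shortcut and `.lockAccounts()` described in the changelog; passing `{ session }` to `.commit()` is an assumption here, and all names and amounts are illustrative:

```ts
import { Book, mongoTransaction } from "medici";

const book = new Book("MyBook");

await mongoTransaction(async (session) => {
  await book
    .entry("Transfer")
    .debit("Assets:User:User1", 100)
    .credit("Assets:Bank", 100)
    .commit({ session });

  // Lock the contentious account LAST: this shrinks the window for write
  // conflicts and so reduces the chance of a TransientTransactionError.
  await book.lockAccounts(["Assets:User:User1"], { session });
});
```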
@koresar + @nicolasburtey
I am not that happy with the name `.lockAccounts()`. Do you have any suggestions?
But yeah, I guess the performance improvement is not to be neglected. :)
@koresar I am gone for 10 days. I hope the code is good to merge. Looking forward to it!
Hi. I'm going through the changes commit by commit. I'm loving the fact that you made small commits rather than pushing one huge ball of code in one go!
Random thoughts, in no particular order:
We can make the default precision 8. It won't break anything for anybody if we move from 7 to 8. (Trust me, I've done it a couple of times by now.)
Instead of a `for` loop I'd recommend using `for-of`. It is as fast as a `for` loop nowadays, AFAIK.
```ts
export function extractObjectIdKeysFromSchema(schema: Schema) {
  const result: Set<string> = new Set();
  for (const [key, value] of Object.entries(schema.paths)) {
    if (value instanceof Schema.Types.ObjectId) {
      result.add(key);
    }
  }
  return result;
}
```
I am loving all the features, fixes, hardening, etc. that you did. The index cardinality got me excited!
Some of the changes labelled as "BREAKING" in the README are more like "POTENTIALLY BREAKING", meaning they would not break most people's code, e.g. the void hardening, the proto pollution protection, etc.
IMPORTANT!
The typescript-migration branch contains the lock "account" feature code. That's no good. This PR is too much to merge.
Could you please contact me via Twitter DM?
Regarding the for loop: I prefer classic for loops, as they have always been the fastest. Actually, `for (var i = 0, il = arr.length ...` is the fastest loop, not the variant using `let`. Also, with the newer constructs the memory consumption could be higher, as they potentially create new objects and variables. But I guess we could easily argue that the for loop is not the bottleneck; the database operations are.
So I am totally OK if you want to use for-of instead of the classic for loop.
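For illustration, the two styles side by side (a made-up snippet, not code from this PR):

```ts
const amounts = [0.1, 0.2, 0.3];

// Classic loop with a cached length, the style preferred above:
for (var i = 0, il = amounts.length; i < il; i++) {
  console.log(amounts[i]);
}

// Equivalent for-of, favoured for readability:
for (const amount of amounts) {
  console.log(amount);
}
```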
I'd change it to `for-of` because code is read more often than it is written. We should write for readability. The current `for` loop usage is premature optimisation, IMO.
We'll continue development in the `next` branch. Without any PRs for now.
I'll publish `v5.0.0-next` to npm.
@koresar I opened this PR so that others can also do a code review :)