orientechnologies / orientdb-labs

OrientDB Labs hosts the latest development version of OrientDB.
Apache License 2.0

[OEP 18] Community Requirements For 3.1 #19

Open tglman opened 6 years ago

tglman commented 6 years ago

Summary: List and discussion of the community's requirements for 3.1.

Goals: Collect and prioritize the requirements for 3.1

Non-Goals:

Success metrics: Shared list of 3.1 requirements

Motivation:

Description: As of today, the various discussions have produced a not-yet-sorted list of requirements:

Alternatives:

Risks and assumptions:

Impact matrix

andrii0lomakin commented 6 years ago

Hi guys. As usual, I do not think these are requirements for 3.1. IMHO it is better to say that this is a discussion of issue priorities: some issues will be implemented in time for 3.1, and some will be moved to another version. Otherwise, this will just be another never-ending release. So, about priorities. At a high level, I want to:

  1. Migrate to key-level and record-level locks at the transaction level.
  2. Use page locking instead of component locking, at least for the most popular index, the B-Tree index.
  3. Decrease the WAL write overhead and, as a result, increase the write speed of other components.
  4. Implement full durability mode in our transactions.
  5. Make our storage engine more lightweight in general.

That is the high-level view; I am not sure that all of it will be done in 3.1.
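To make the first item concrete, here is a minimal sketch of record-level locks that are held until the end of a transaction. It is an illustration only, not OrientDB code: `RecordLockManager`, `SketchTransaction`, and the use of plain string keys are all assumptions, and the sketch ignores deadlock detection, shared/exclusive modes, and cleanup of unused lock entries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical names throughout; this is not OrientDB's lock manager.
final class RecordLockManager<K> {
    // One lock per key instead of one lock for the whole component.
    // Entries are never removed in this sketch.
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock lockFor(K key) {
        return locks.computeIfAbsent(key, k -> new ReentrantLock());
    }
}

final class SketchTransaction<K> {
    private final RecordLockManager<K> manager;
    // Locks acquired so far, most recent first.
    private final Deque<ReentrantLock> held = new ArrayDeque<>();

    SketchTransaction(RecordLockManager<K> manager) {
        this.manager = manager;
    }

    // Acquire the lock for one record and keep it until the transaction ends.
    void lock(K key) {
        ReentrantLock lock = manager.lockFor(key);
        lock.lock();
        held.push(lock);
    }

    // Release everything at commit or rollback, in reverse acquisition order.
    void close() {
        while (!held.isEmpty()) {
            held.pop().unlock();
        }
    }

    int heldCount() {
        return held.size();
    }
}
```

With this shape, two transactions that touch different records never contend on a shared component lock; they only block each other when they lock the same key.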

  1. Let's look at what can be done in the short term with a big impact. Currently, I am working on an immutable WAL, and this change is a must-have for implementing full durability mode. The problem is the following. When we write new records from the WAL to disk after a flush, if the page is not fully written, we read this page back and write additional data into it. This breaks the main invariant of our durability framework: if data is written to a page, it has to be in the WAL. Of course, changing the WAL write strategy will change both the calculation of LSNs and the strategy of writing data to disk. An interesting side effect is that the new WAL is mostly based on CAS operations and, as a result, should be more scalable.
  2. The next thing, which is really fast to implement because we already have it implemented with some modifications, is a lock-free read cache for our disk data. We have a separate issue, #8, for that. It will take about two weeks, and the change will automatically include support for small pages inside the read cache.
  3. The two items above are very quick to implement. The next feature takes longer, but it unlocks big possibilities: physiological logging. It allows us to: a) decrease WAL overhead; b) make the storage engine much more lightweight by removing the tracking of page changes; c) weaken the requirements on the locks needed to implement durability; d) implement full durability mode with overhead comparable to that of an RDBMS.
  4. Once physiological logging is implemented, the next logical step is to migrate to key/record-level locking at the transaction level.
  5. The next step is the implementation of full durability mode.
  6. The final step that I see is the implementation of a B-link tree index, which allows implementing range indexes using page locks instead of component locks.
  7. And yes, of course, the small-pages issue can be solved right after the implementation of the lock-free read cache.

I suppose that is enough for a list of short-term and not-so-short-term tasks.
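As a rough illustration of why a CAS-based WAL can scale better than a lock-based one, the sketch below lets writers reserve space in an append-only log with a compare-and-set loop instead of a mutex. The class `CasLogAllocator` and the idea of using the reserved start offset as an LSN-like position are assumptions made for this example; nothing here is taken from the actual OrientDB implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and layout are assumptions, not OrientDB's WAL.
final class CasLogAllocator {
    // Offset of the first free byte in the (conceptually append-only) log.
    private final AtomicLong end = new AtomicLong(0);

    // Reserves recordSize bytes and returns the start offset of the reservation.
    // Each successful caller gets a distinct, non-overlapping range.
    long reserve(int recordSize) {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("recordSize must be positive");
        }
        while (true) {
            long start = end.get();
            // Succeeds only if no other writer moved the end in the meantime.
            if (end.compareAndSet(start, start + recordSize)) {
                return start;
            }
            // Another writer won the race; re-read the end and retry.
        }
    }

    long end() {
        return end.get();
    }
}
```

A writer that has reserved a range can then copy its record into that range without holding any lock, since no other writer can be handed the same bytes. A real log would still need to track which reserved ranges are fully written before flushing, which this sketch leaves out.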

andrii0lomakin commented 6 years ago

I forgot to add the implementation of support for files that are not covered by the WAL, and for temporary files. But because a lot of emphasis is currently put on testing new components to keep the storage engine very stable, I suppose it can be done in the gaps while tests for other features are running. And a small note: even if performance before and after the physiological logging change is the same, I will consider it a win, because it will unlock the possibility of increasing system scalability. But I have significant doubts that performance will stay the same.
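For readers unfamiliar with the term, physiological logging is usually summarized as "physical to a page, logical within a page": a log record names the page it applies to, but describes the change as an operation rather than as a diff of changed bytes. The sketch below shows that idea with a hypothetical `InsertSlotRecord`; the types, the string-valued slots, and the page-LSN check used to make redo idempotent are assumptions for illustration, not OrientDB's page or WAL format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not OrientDB's page or WAL format.
final class SketchPage {
    final long pageId;
    // LSN of the last log record applied to this page.
    long lsn = -1;
    final List<String> slots = new ArrayList<>();

    SketchPage(long pageId) {
        this.pageId = pageId;
    }
}

// The record targets exactly one page, but stores an operation
// ("insert value at slot") instead of the bytes that changed.
final class InsertSlotRecord {
    final long lsn;
    final long pageId;
    final int slot;
    final String value;

    InsertSlotRecord(long lsn, long pageId, int slot, String value) {
        this.lsn = lsn;
        this.pageId = pageId;
        this.slot = slot;
        this.value = value;
    }

    // Redo is idempotent: it is skipped when the page already reflects this record.
    void redo(SketchPage page) {
        if (page.pageId != pageId) {
            throw new IllegalArgumentException("record does not belong to this page");
        }
        if (page.lsn >= lsn) {
            return;
        }
        page.slots.add(slot, value);
        page.lsn = lsn;
    }
}
```

Because the record carries only the operation and its arguments, there is no need to track which byte ranges of the page changed, which is the kind of bookkeeping the discussion above hopes to remove.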

andrii0lomakin commented 6 years ago

OK, after discussion, I suppose the shortest list for 3.1 is issues #8 and #9. I will change their versions accordingly.