Methuselah96 / immutable-js

Immutable persistent data collections for Javascript which increase efficiency and simplicity.
http://facebook.github.io/immutable-js/
MIT License
0 stars 2 forks source link

Feature request: Priority queue / self-sorting list [can PR] #4

Open Methuselah96 opened 4 years ago

Methuselah96 commented 4 years ago

Migrated from don't! Originally created by @mcclure on Thu, 08 Oct 2020 05:15:11 GMT


Hi, I have a need for a "priority queue" style data structure, or more specifically I need an immutable structure that I can add items to in arbitrary order and then repeatedly efficiently iterate over according to some fixed property (for example if I have a list of objects obj, I might want to keep it sorted by the key obj.date).

I need this enough I plan to attempt an implementation myself. I would be happy to write a PR if there is a specific way I could write it that you would be willing to accept.

I find a number of algorithms online for immutable priority queues, but it seems like it might be a good idea to base my implementation on the existing list structure rather than implementing a totally new structure. For this reason I am curious if there is any "contributor documentation" that would help me understand the underlying principles of immutable.js List. I am pawing through the source but not having much luck understanding where the important parts are. List.js seems to be describing something like a linked list but each node has this "array" member rather than a next reference and I'm not sure what "array" contains.

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Fri, 09 Oct 2020 15:56:29 GMT


(One other small thing: an alternative to using the "key function" would be make this a key/value store and sort on the key. This would result in the same implementation but a slightly different external interface. I think this approach is a little weird because you can have multiple values sharing a single key, but I don't have a strong opinion either way about it.)

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Mon, 12 Oct 2020 22:52:49 GMT


Made a StackOverflow question about my difficulties running npm run test / npm run perf: https://stackoverflow.com/questions/64325833/how-to-run-perf-tests-automated-tests-in-an-immutable-js-repo

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Tue, 13 Oct 2020 05:34:26 GMT


I've started on an implementation of this, feature branch is here https://github.com/mcclure/immutable-js/tree/pq

This adds an exported SortedList and isSortedList. SortedList has a complete implementation of one function, add(). If you call it, it crashes. (The implementation of "add" turned out to be more complicated than I thought! I don't actually know whether my approach is actually more efficient than just using a balanced tree or something.)

Will make an initial PR once I have add, pop and shift working.

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Sat, 10 Oct 2020 16:30:31 GMT


Also, what are the "@@transducer/" functions for? I do not find any description of them in the docs.

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Sat, 10 Oct 2020 15:31:06 GMT


ownerID in List is only needed for the implementation of mutable mode, right?

Methuselah96 commented 4 years ago

Migrated from don't reference! Original comment by @mcclure on Fri, 09 Oct 2020 15:54:00 GMT


Got some help on Twitter, here's the best I can understand: List is an "Vector Trie" apparently patterned on the Scala vector type, meaning it's a tree where every leaf is an array of maximum 32 elements. Bottom leaves other than the rightmost are assumed to be filled and when you lookup an index the node you're looking for is found by pure math, which is why midlist insert is O(N).

My plan is:

On top of this basic plan, I may attempt the following two nice-to-haves:

I haven't done any proofs this is actually efficient, but my intuition is that because removal is generally only done at the beginning and end and insertion cannot reduce leaf size below 16, that this will give between O(log_16(n)) and O(log_32(n)) performance on insertions, front pops and end pops, and this is provable.

This approach would resolve the problem in my application. If I get it working, would you be interested in merging it? What requirements do you have that a new feature like this has adequate perf (ie, do you require proofs, or just verification of good average-case performance in testing?)

Things I'm not clear on yet:

I am also seeing a problem where if I run npm install in an immutable-js checkout, I get node-gyp errors related to the v8 headers and ending in "Failed at the microtime@2.1.8 install script". This is on OS X 10.13.6 with npm 6.14.8, node v14.10.1. I have not investigated this yet. I do not get this error when npm installing a project with immutable-js as a dependency so I assume this is a problem with one of the devDependencies. EDIT: fixed this with PR #1792. However I am still unable to run npm run build or npm run test because I get "Error: Cannot find module 'react/addons'", and I am seeing this mysterious error when I run git run perf:

Pearl:immutable-js mcc$ npm run perf

> immutable@4.0.0-rc.12 perf /Users/mcc/work/h/tempt2/fork/immutable-js
> node ./resources/bench.js

ugh Error: Command failed: git show master:dist/immutable.js
fatal: Path 'dist/immutable.js' exists on disk, but not in 'master'.

    at ChildProcess.exithandler (child_process.js:308:12)
    at ChildProcess.emit (events.js:314:20)
    at maybeClose (internal/child_process.js:1047:16)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:287:5)

I can't even figure out what script is trying to invoke git show.