konn / computational-algebra

General-Purpose Computer Algebra System as an EDSL in Haskell
http://konn.github.io/computational-algebra/
BSD 3-Clause "New" or "Revised" License
92 stars 9 forks source link

Improve representation of monomials #15

Open sheaf opened 3 years ago

sheaf commented 3 years ago

Currently, monomials are represented as unboxed vectors of Ints. This seems rather inefficient: an unboxed vector stores an offset, a length, and the data is stored byte-per-byte instead of word-per-word.

Two ideas come to mind instead:

newtype Monomial ( n :: Nat ) = Monomial ( Array# Word )

or, avoiding all indirections entirely:

data family Monomial ( n :: Nat ) :: TYPE ( 'TupleRep ( Replicate n WordRep ) )
newtype instance Monomial 0 = M0 (# #)
newtype instance Monomial 1 = M1 (# Word# #)
newtype instance Monomial 2 = M2 (# Word#, Word# #)
newtype instance Monomial 3 = M3 (# Word#, Word#, Word# #)
newtype instance Monomial 4 = M4 (# Word#, Word#, Word#, Word# #)
...

I'm also wondering if a Map is the best data-structure to use to represent polynomials, but I suppose that's best left to a separate ticket.

konn commented 3 years ago

Thank you for suggestions!

I'm also wondering if a Map is the best data-structure to use to represent polynomials, but I suppose that's best left to a separate ticket.

Yes, repr. of polynomial deserves another issue - indeed, the desirable representation depends on the purpose. Groebner basis algorithms performs well with ordered structures - hence we are using Maps here by default. I once tried heaps as an alternative repr, but it didn't buy me a good speed up at that time . OTOH, multivariate factorisation works well with recursive, nested representation. Anyway, as we have classes abstracting over polynomials, so we can have multiple representation for specific use cases. And Map-based repr performs relatively well at average for the time being.

or, avoiding all indirections entirely:

I think Unboxed approach won't scale as it needs extra data insts and makes it hard to implement folding (e.g. for monomial ordering and sugars) generically. IMHO, its performance gain doesn't deserve the effort to rewrite codes entirely (could be wrong though).

Array seems good, but then we could use even more efficient primitive (unboxed) array: PrimArray. I think we can use my subcategories package to reduce the amonunt of code modification. As I'm a busy recently, so a pull request (including runtime and heap benchmarks) is really welcome!

konn commented 3 years ago

Ah, I've forgot that I'd alredy migrated to the most recent sized package; then just siwtching to Sized PrimArray should work.