Type-level distinctions in C interface

elemental / Elemental

Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction

Type-level distinctions in C interface #77

Open jedbrown opened 9 years ago

jedbrown commented 9 years ago

As mentioned in #76, the type-level distinction between Matrix, DistMatrix, SparseMatrix, and DistSparseMatrix causes a combinatorial number of wrappers. This complicates writing generic interfaces (such as Python) on top, and may also be undesirable for users who want to compare variants or write code that operates on multiple matrix types.

poulson commented 9 years ago

@jedbrown What sort of an abstraction on top of these matrix types would you propose? The vast majority of Elemental's code is heavily dependent upon which of the above matrix types is being operated on, but I agree that it would be nice for user-facing code not to care.

But with such an abstraction, it would then become necessary to handle conversions. For example, if A is a sparse matrix and I form C = A^T A, which has a very high nonzero count, should it be converted to dense? And should the user have a way to manually do so?

jedbrown commented 9 years ago

What about having a single user-visible matrix type with different constructors and with conversion routines? The simplest implementation is a tagged union, but it could be made extensible.
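To make the tagged-union idea concrete, here is a minimal sequential sketch. Every name in it (`Mat`, `MatKind`, `mat_create_dense`, `mat_create_sparse`, `mat_to_dense`, `mat_get`) is hypothetical and not part of Elemental's C interface; the sparse variant uses plain coordinate storage as an assumption, and distribution is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; none of this is Elemental's API. */
typedef enum { MAT_DENSE, MAT_SPARSE } MatKind;

typedef struct {
    int height, width;
    double *buf;              /* column-major, height*width entries */
} DenseData;

typedef struct {
    int height, width, nnz;
    int *row, *col;           /* coordinate (COO) format */
    double *val;
} SparseData;

typedef struct {
    MatKind kind;             /* tag: selects the active union member */
    union { DenseData dense; SparseData sparse; } u;
} Mat;

/* Constructors: different storage, same user-visible type. */
Mat *mat_create_dense(int height, int width)
{
    assert(height > 0 && width > 0);
    Mat *A = malloc(sizeof *A);
    if (!A) return NULL;
    A->kind = MAT_DENSE;
    A->u.dense.height = height;
    A->u.dense.width = width;
    A->u.dense.buf = calloc((size_t)height * (size_t)width, sizeof(double));
    if (!A->u.dense.buf) { free(A); return NULL; }
    return A;
}

Mat *mat_create_sparse(int height, int width, int nnz,
                       const int *row, const int *col, const double *val)
{
    assert(height > 0 && width > 0 && nnz > 0);
    Mat *A = malloc(sizeof *A);
    if (!A) return NULL;
    SparseData *s = &A->u.sparse;
    A->kind = MAT_SPARSE;
    s->height = height; s->width = width; s->nnz = nnz;
    s->row = malloc((size_t)nnz * sizeof *s->row);
    s->col = malloc((size_t)nnz * sizeof *s->col);
    s->val = malloc((size_t)nnz * sizeof *s->val);
    if (!s->row || !s->col || !s->val) {
        free(s->row); free(s->col); free(s->val); free(A);
        return NULL;
    }
    memcpy(s->row, row, (size_t)nnz * sizeof *s->row);
    memcpy(s->col, col, (size_t)nnz * sizeof *s->col);
    memcpy(s->val, val, (size_t)nnz * sizeof *s->val);
    return A;
}

void mat_destroy(Mat *A)
{
    if (!A) return;
    if (A->kind == MAT_DENSE) {
        free(A->u.dense.buf);
    } else {
        free(A->u.sparse.row); free(A->u.sparse.col); free(A->u.sparse.val);
    }
    free(A);
}

/* Generic accessor: one entry point, dispatch on the tag. */
double mat_get(const Mat *A, int i, int j)
{
    if (A->kind == MAT_DENSE)
        return A->u.dense.buf[i + (size_t)j * A->u.dense.height];
    double sum = 0.0;         /* duplicate COO entries are summed */
    for (int k = 0; k < A->u.sparse.nnz; ++k)
        if (A->u.sparse.row[k] == i && A->u.sparse.col[k] == j)
            sum += A->u.sparse.val[k];
    return sum;
}

/* Explicit conversion routine: always returns a new dense matrix. */
Mat *mat_to_dense(const Mat *A)
{
    if (A->kind == MAT_DENSE) {
        const DenseData *d = &A->u.dense;
        Mat *B = mat_create_dense(d->height, d->width);
        if (B) memcpy(B->u.dense.buf, d->buf,
                      (size_t)d->height * (size_t)d->width * sizeof(double));
        return B;
    }
    const SparseData *s = &A->u.sparse;
    Mat *B = mat_create_dense(s->height, s->width);
    if (!B) return NULL;
    for (int k = 0; k < s->nnz; ++k)
        B->u.dense.buf[s->row[k] + (size_t)s->col[k] * s->height] += s->val[k];
    return B;
}
```

With this layout, a wrapper layer only needs to bind `Mat` once; the per-type code paths stay behind the tag, and the user converts manually by calling `mat_to_dense`.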

One way to handle specification of output types for matrix operations is to have the output argument be "created" before calling the function. That creation could be something generic, something that specifies the type, or a fully-constructed matrix (in which case the allocated memory is to be reused).
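A rough sketch of that calling convention, using the C = A^T A example from above with a dense input to keep it short. The names `Out`, `OutKind`, `out_generic`, `out_of_kind`, `out_dense`, and `gram` are all hypothetical, and the sketch assumes only a dense result is implemented; a request for sparse output simply returns an error code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch; names are illustrative, not Elemental's API. */
typedef enum { OUT_ANY, OUT_DENSE, OUT_SPARSE } OutKind;

typedef struct {
    OutKind kind;       /* OUT_ANY: let the operation choose the storage */
    int height, width;
    double *buf;        /* dense column-major storage, NULL until allocated */
} Out;

/* Three ways of "creating" an output before the call. */
Out out_generic(void)         { Out C = { OUT_ANY, 0, 0, NULL }; return C; }
Out out_of_kind(OutKind kind) { Out C = { kind, 0, 0, NULL };    return C; }
Out out_dense(int height, int width)
{
    assert(height > 0 && width > 0);
    Out C = { OUT_DENSE, height, width, NULL };
    C.buf = malloc((size_t)height * (size_t)width * sizeof *C.buf);
    return C;           /* C.buf is NULL if the allocation failed */
}

/* C := A^T A for a dense column-major m x n input A.
   Returns 0 on success, nonzero on failure. */
int gram(int m, int n, const double *A, Out *C)
{
    assert(m > 0 && n > 0);
    if (C->kind == OUT_SPARSE) return 1;          /* not implemented in this sketch */
    if (C->kind == OUT_ANY) C->kind = OUT_DENSE;  /* the operation picks the type */

    /* Reuse the caller's memory when the shape already matches. */
    if (C->buf == NULL || C->height != n || C->width != n) {
        double *p = realloc(C->buf, (size_t)n * (size_t)n * sizeof *p);
        if (!p) return 2;
        C->buf = p; C->height = n; C->width = n;
    }
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            for (int k = 0; k < m; ++k)
                sum += A[k + (size_t)i * m] * A[k + (size_t)j * m];
            C->buf[i + (size_t)j * n] = sum;
        }
    }
    return 0;
}
```

The caller decides how much to commit to up front: `out_generic()` defers the choice to the operation, `out_of_kind(...)` pins the storage type, and `out_dense(n, n)` hands over a buffer that `gram` fills in place without reallocating.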

poulson commented 9 years ago

Several of the matrix classes have equivalent constructors right now (e.g., DistMatrix, DistSparseMatrix, and DistMultiVec also assume the user wants to distribute over MPI_COMM_WORLD by default); what sort of defaults would make sense for the user-level matrix type?

And, as a practical matter, what I'm most concerned with on this issue is good naming conventions. As you're well aware, Elemental was initially designed with only dense matrices in mind, so it is a bit regrettable that the names DistMatrix and AbstractDistMatrix refer specifically to dense matrices. I think that, if the matrix type system is to be enriched (or abstracted further), it would be a good idea to come up with more consistent (and perhaps shorter) names. For this reason, this might be best planned for inclusion in 0.87 (though I would be happy to start a branch now).

jedbrown commented 9 years ago

My personal view is that in general, the communicator is of such fundamental semantic importance that it should not be hidden from the user. I know a lot of people get away with assuming MPI_COMM_WORLD, but I hate it. Anyway, even if you hide it from some high-level interface, it's critical that it be easy to choose a different communicator.

As for constructors, you can keep implementation-specific constructors but just make them all return the same matrix type (different implementations).

If you only have one user-visible type (e.g., the tagged union), why not call it ElMatrix? All the distribution (and perhaps precision) variants can happen behind that interface. Surely that's what a Python user would expect to see.
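One extensible way to sketch this is an opaque handle that carries a table of function pointers, so that each implementation-specific constructor returns the same type. Only the type name `ElMatrix` comes from the suggestion above; the functions `el_matrix_create_dense`, `el_matrix_create_diagonal`, `el_matrix_get`, `el_matrix_trace`, and `el_matrix_destroy` are hypothetical. A diagonal matrix stands in for a sparse implementation, and the sketch is sequential: a distributed constructor would additionally take the communicator as an explicit argument, which is omitted here so the example compiles without MPI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; not Elemental's actual C interface. */
typedef struct ElMatrix ElMatrix;

typedef struct {
    const char *name;
    double (*get)(const ElMatrix *A, int i, int j);
    void (*destroy)(ElMatrix *A);
} ElMatrixOps;

struct ElMatrix {
    const ElMatrixOps *ops;   /* selects the implementation */
    int height, width;
    void *impl;               /* implementation-specific storage */
};

static double dense_get(const ElMatrix *A, int i, int j)
{
    const double *buf = A->impl;          /* column-major entries */
    return buf[i + (size_t)j * A->height];
}

static double diag_get(const ElMatrix *A, int i, int j)
{
    const double *d = A->impl;            /* only the diagonal is stored */
    return i == j ? d[i] : 0.0;
}

static void common_destroy(ElMatrix *A) { free(A->impl); free(A); }

static const ElMatrixOps dense_ops = { "dense", dense_get, common_destroy };
static const ElMatrixOps diag_ops  = { "diagonal", diag_get, common_destroy };

/* Shared helper: allocate the handle and copy `count` entries into it. */
static ElMatrix *alloc_matrix(const ElMatrixOps *ops, int height, int width,
                              const double *entries, size_t count)
{
    ElMatrix *A = malloc(sizeof *A);
    if (!A) return NULL;
    A->impl = malloc(count * sizeof(double));
    if (!A->impl) { free(A); return NULL; }
    memcpy(A->impl, entries, count * sizeof(double));
    A->ops = ops; A->height = height; A->width = width;
    return A;
}

/* Implementation-specific constructors, all returning ElMatrix*. */
ElMatrix *el_matrix_create_dense(int height, int width, const double *colmajor)
{
    assert(height > 0 && width > 0);
    return alloc_matrix(&dense_ops, height, width, colmajor,
                        (size_t)height * (size_t)width);
}

ElMatrix *el_matrix_create_diagonal(int n, const double *diag)
{
    assert(n > 0);
    return alloc_matrix(&diag_ops, n, n, diag, (size_t)n);
}

/* Generic entry points: callers never see which implementation is used. */
double el_matrix_get(const ElMatrix *A, int i, int j) { return A->ops->get(A, i, j); }
void el_matrix_destroy(ElMatrix *A) { if (A) A->ops->destroy(A); }

/* User code written once against the single visible type. */
double el_matrix_trace(const ElMatrix *A)
{
    int n = A->height < A->width ? A->height : A->width;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += el_matrix_get(A, i, i);
    return sum;
}
```

Compared with a closed tagged union, adding another implementation here means adding one ops table and one constructor, without touching `el_matrix_trace` or any other generic caller.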

poulson commented 9 years ago

Just to clarify, by no means do I always assume MPI_COMM_WORLD; it is just the default value of the communicator argument to the constructors of my distributed objects. I think it is definitely the right default value (and it is immediately duplicated within the constructor).

poulson commented 9 years ago

And I agree that ElMatrix would be the most natural name for the tagged union. There will need to be a Great Name Change.