Add HMatrix Backend - Githubissues

freckletonj commented 2 years ago

I've begun adding instances, and have only accomplished the trivial ones. You're right, there's much to think about wrt TensorSpace.

leftaroundabout commented 2 years ago

Thanks! That's a good start.

So you've decided to make it TensorProduct (R n) w = [w] for now... Yeah, that probably makes sense at first, we can benchmark it to see how much worse it performs than native hmatrix mat-mul operations.

Shall I fill out the TensorSpace methods? Could you take care of testing & benchmarking, for now using free-vector-spaces?

Have you formed some opinions with regard to the different options how to do these tensors properly?

freckletonj commented 2 years ago

I was just playing around, and didn't necessarily settle on TensorProduct (R n) w = [w]. I feel a bit out of my league even commenting on this, but:

type TensorProduct (HMatrix.R n) (HMatrix.R m) = HMatrix.L n m : I agree this seems ideal, but I'm not clear why we can't have it. Is the idea that we'd need to support something like tensorProduct (x :: R n) (y :: [Double])?
type family HMatRTensor n where ... : I agree this seems like a good option (based mostly on my type family intuitions), but that this seems like something the compiler would give grief over.
singletons : I have some experience with solving things with singletons. IME, once I found how to use them for a specific problem they made everything far easier. I'm not sure how that solution might look here, but, I'm happy to lend my experience if you can guide me on what that solution would look like. I think singletons have some performance implications, but, I'm happy to benchmark several solutions if we can make several.
typeable : I agree that your concerns seem concerning. I'm not sure how the issues would play out here, but, again IME, singletons were superior to trying to get a typeable constraint into a class.
"custom mechanism" : I'm not sure I see what that'd look like for this
type TensorProduct (HMatrix.R n) w = [w] : seems sad to leave the world of HMatrix arrays, but, I can benchmark it.
"hybrid solution" : is this basically the "custom mechanism" you mention above? It seems interesting. Although I don't get your performance degradation comment:

HMatRTensor n (HMatrix.R m) can just as well come wrapped in the HMatGenericTensor :: HMatRTensor n [w]

how can it? The GADT's types wouldn't align, so, I think this would be safe. I see your point if it were an ADT, but GADTs seem safe.

I'm in no rush for this, so, I'd be happy if we tried maybe a few approaches, and I'll benchmark them. Maybe we can have:

Numeric/LinearAlgebra/Static/Experiment/TypeFamily.hs
Numeric/LinearAlgebra/Static/Experiment/Singletons.hs
Numeric/LinearAlgebra/Static/Experiment/TensorList.hs
Numeric/LinearAlgebra/Static/Experiment/GADT.hs

That'd be great if you wanted to start filling out TensorSpace and I'll start benchmarking!

leftaroundabout commented 2 years ago

I agree this seems ideal, but I'm not clear why we can't have it. Is the idea that we'd need to support something like tensorProduct (x :: R n) (y :: [Double])?

Well, rather FinSuppSeq Double, but yes – a goal that has been important for me with this library is that it should be possible to have tensor products between any vector spaces, including infinite-dimensional ones, and not just the usual “tensor=matrix” reduction.

type family ... seems like a good option ... but that this seems like something the compiler would give grief over

Exactly what my experience / intuition also says.

I think singletons have some performance implications

Yes, I suppose. But I'd hope that this would be a one-time cost for an entire vector operation. I'd expect the HMatrix backend to be used for large vectors, where this would become insignificant (similar to how people use Python and get good performance because v + w is only calling NumPy once which has the v[i] + w[i] loop implemented in C).

We seem to agree that Typeable isn't very promising.

As for “custom mechanism” – well, I've hardly thought that through. The idea is something like a type family Backend v. This would be the same for e.g. all HMatrix types (likewise for all accelerate-based arrays or whatever). Then we could use Typeable on the backend and thereby allow two HMatrix spaces to “recognize” each other when you take the tensor product between them. Then this concrete Backend could be a GADT that encapsulates any constraint necessary to make an optimised representation for this particular tensor product.
Does that make any sense?

The “hybrid solution” is quite different from this. The problem I'm seeing is, the GADT

data HMatRTensor n w where
  HMatrixTensor :: HMatrix.L n m -> HMatRTensor n (HMatrix.R m)
  HMatGenericTensor :: [w] -> HMatRTensor n [w]

only ensures that HMatrixTensor is restricted to R n ⊗ R m products. We could easily use this constructor for identity matrices and then also for any sum that involves one of these constructors.
But the HMatGenericTensor constructor is more general, and it also allows to be used for the case R n ⊗ R m (as well as for any other tensor product). When taking a generic tensor product between two vectors, even if they both are HMatrix types, the compiler wouldn't take note of that and we would end up representing that tensor in the less-than-ideal [R m] representation.

leftaroundabout commented 2 years ago

I haven't forgotten about your PR, but you know how it is... the little time I could set aside for maintaining my old Haskell packages went mostly into a bunch of changes in manifolds that I wanted to make already for about two years. But that's now done and about to be published, and I could then focus a bit on this here.

Have you arrived at any further conclusions in the meantime as to how the backend problem should be addressed?

leftaroundabout commented 2 years ago

After repeatedly rolling the thoughts in my head, I like none of the options (for achieving the tensor products of hmatrix-vectors to be implemented as hmatrix-matrices) enough to justify the added complexity to the library. Specifically, all this would push the package more in a direction of being just another type wrapper around the old everything's a matrix paradigm, and though putting matrices on a more category-cal backing isn't worthless, that's just not at all in line with my grand vision for this library (getting rid of particular basis expansions).

There's still nothing wrong with having vector spaces whose implementation is based on hmatrix, Furthermore it's totally possible to create another category, which simply always uses hmatrix matrices under the hood. That could still be made compatible with the types from this library by restricting to FiniteDimensional and expanding everything in the canonical basis.

Meanwhile, for the TensorSpace instance I'd just stick with TensorProduct (R n) w = [w], even though that's not great for performance.

One change to the classes that would be relatively simple and could help with performance would be, instead of a full-blown variety of backends, to only add a method that decides whether or not the type is Unbox. Then the instances with a [w] tensor product could switch to VU.Vector w for those types where it's possible, specifically for small constant size types where boxing has a significant performance impact.

leftaroundabout commented 2 years ago

Yet another option, before I settle on one: add

class (...) => TensorSpace v where
  ...
  type StaticDimension v :: Maybe Nat

This would be allowed to be Nothing (certainly for infinite-dimensional spaces, perhaps also for large-dimensional ones), but for something like V3 we could then use a dense unbox-vector or hmatrix packing of the given basis expansion.

...So as to hopefully get the best of both worlds between generality and performance, though that may be optimistic.

leftaroundabout commented 1 year ago

Some heads-up on the (glacial-pace) progress with regards to this PR:

Added two classes, DimensionAware and Dimensional. The latter expresses essentially just “space v has dimension n”, so those can readily be put in any array-representation. The former is an explicitly-optional version of Dimensional, the purpose being to support both fixed-dimensional and e.g. infinite-dimensional ones in a common interface but still enable tensor implementations to cater for each case.
Changed the category used for coercing between representations of tensors. Previously this was just standard Coercion, but that pretty much requires that all tensors are ordinary Hask-functors underneath (which precludes implementations like hmatrix). The new VSCCoercion is at the moment still equivalent to that, but conceptually expresses that the coercion preserves the conventional basis expansion and certainly the dimension (if known). As a result, this could also be used to coerce between matrix-based representations of tensors over the spaces.

The further plan is to make DimensionAware a superclass of TensorSpace, make the dimension-fixing explicit in VSCCoercion, and add methods to Dimensional for converting to and from number-arrays.

@freckletonj do you have any feedback on this, to either the general roadmap or the concrete implementation? In particular, this is the first time I really used singletons, to express the dependent typing involved when we construct the optional-dimension of tensor products etc.. Any comments on that?

leftaroundabout / linearmap-family

Add HMatrix Backend #6