libmir / mir-algorithm

Dlang Core Library
http://mir-algorithm.libmir.org
Other
174 stars 37 forks source link

labels support #426

Open 9il opened 2 years ago

9il commented 2 years ago

Minimal support:

Advanced support:

jmh530 commented 1 year ago

Also note that functions that take a slice as input can't accept a slice with labels. I show using isMatrix below, but isConvertibleToSlice also works. So functions that don't have an overload for isConvertibleToSlice should if possible.

/+dub.sdl:
dependency "mir-algorithm" version="*"
+/

import mir.ndslice.allocation: slice;
import mir.ndslice.slice: Slice, SliceKind;
import mir.ndslice.traits: isMatrix;

void foo(SliceKind kind)(Slice!(double*, 2, kind) x) {}
void bar(M)(M x) if(isMatrix!M) {}

void main()
{
    auto dataframe = slice!(double, string)(4, 3);
    foo(dataframe); //does not compile
    bar(dataframe);
}
jmh530 commented 1 year ago

toSlice is another one of the functions (among many) that suffers from this problem. The problem is that relying on something like isSlice!S may result in a lot of template bloat. So what you would have to do is have some functions that constraint on isSlice!S and that just forward to some implementation function after stripping the labels.

One related problem is that there is a bug (#23924) related to how DMD handles overloads with template enum parameters combined with variadic templates that makes some of this more difficult than it should be.

Another solution that had been discussed in the past is to have DataFrame as a separate type that takes a slice as an alias this member.

import mir.ndslice.slice;

struct DataFrame(Iterator, size_t N, SliceKind kind, Labels...)
{
    Slice!(Iterator, N, kind) x;
    alias x this;
}

void foo(Iterator, size_t N, SliceKind kind)(Slice!(Iterator, N, kind) x) {}

void main()
{
    DataFrame!(double*, 1, Contiguous, string) x;
    foo(x);
}
jmh530 commented 1 year ago

Eh, maybe a better approach would be define something like

enum bool isSliceExact(T) = is(T == Slice!(Iterator, N, kind), Iterator, size_t N, SliceKind kind);

and then use it like

void foo(Iterator, size_t N, SliceKind kind)(Slice!(Iterator, N, kind) x) {}
void foo(T)(T x) if(isConvertibleToSlice!T && !isSliceExact!T) {x.toSlice.foo}

This should avoid the problem of matching both of them. It should work with slices that have labels as well.

9il commented 1 year ago

Labels may be a breaking change. mir-algorithm and mir-ion have to try to avoid them until at least 2026.

jmh530 commented 1 year ago

What happens in 2026?

9il commented 1 year ago

The thing is that I am going to spend far less time with Mir. But still have to support it for a few years. On the other hand, we can add new code if that will not affect existing behavior.

jmh530 commented 1 year ago

Thanks. I was just confused by the reference to 2026. Your previous comment made it seem like improved labels support might come, but not until at least after 2026. This most recent comment suggests the support is not likely to come at all.

If one of my PRs is adding too much new code or changes, just let me know and we can close. I've got an outstanding lubeck PR, but I'm thinking about just moving the functionality I need into mir-stat.

9il commented 1 year ago

Actually, if the API doesn't change existing code in Lubeck, you can add new API ther.

jmh530 commented 1 year ago

@9il See here. I don't introduce breaking changes, but I am making changes to existing code (the diff in some cases makes it look like there are more significant changes than there are).

My objective from these changes is to have multivariate normal densities in mir.stat. Having functions to handle quadratic forms and Mahalanobis distance is useful in this, and improving the handling of matrix multiplication makes writing those functions simpler. It doesn't really matter to me where this stuff is located.