Closed theogf closed 2 years ago
Thanks for the feedback! I guess the bijective
keyword makes more sense but would need even more explanations, yet better than what there is.
Merging #61 (316d739) into master (40d2790) will increase coverage by
0.22%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #61 +/- ##
==========================================
+ Coverage 94.44% 94.66% +0.22%
==========================================
Files 8 8
Lines 72 75 +3
==========================================
+ Hits 68 71 +3
Misses 4 4
Impacted Files | Coverage Δ | |
---|---|---|
src/likelihoods/categorical.jl | 100.00% <100.00%> (ø) |
|
src/links.jl | 84.21% <100.00%> (+1.85%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 40d2790...316d739. Read the comment docs.
That should be ready to merge, maybe we can merge #41 at the same time?
Maybe you could add some
@inferred
tests?
I am sorry but I am not sure how it would work. I should just have lines like
@inferred CategoricalLikelihood(args...; kwargs...)
?
Ah I see now, @inferred
will fail with Bool
arguments but pass with Val{true/false}
@devmotion Somehow I am not happy with the current setup.
CategoricalLikelihood
is not an AbstractLink
(as we try to enforce for most likelihoods right now)Link
responsible for deciding the variant (the likelihood itself is not really bijective)How about instead having the variants included as a wrapper for the Link
s, like
struct BijectiveSimplexLink{L} <: AbstractLink
link::L
end
(l::BijectiveSimplexLink)(f::AbstractVector{<:Real}) = l.link(vcat(f, 0))
# etc...
I'm not so much worried about type parameters (one could also rearrange them) but I think the second point is valid. However, I wonder if all these considerations are valid/relevant for anything else than the softmax
function. Maybe it would be even more "correct" to just define two softmax links, one that is bijective and one that isn't. And make the bijective one the default but mention the other one in the docstring. And don't change the Categorical...
type at all?
all these considerations are valid/relevant for anything else than the softmax function
There are already two other links in this situation the "Heaviside" one (typically pdf(y=i) = i == findmax(f)
) or from my own work the logisticsoftmax
where instead of exp
you use logistic
. This would nicely wrap around those.
I see, in this case it seems reasonable to define the link types in the general fashion you outlined and allow other wrapped links besides softmax
. But even in the general case we should only change the default link but not the Categorical...
type with this design, I think.
This solves #55 by having two variants (Simplex and Curved). This add a new parametric type to
Categorical
.SimplexVariant
(the original version) andCurvedVariant
(the overparametrized version).SimplexVariant
is pretty straightforward to understand I believe (N-1 inputs for N categories)CurvedVariant
comes from the fact that such a parametrization leads to a "Curved Exponential Family" (see the small text after the table in https://en.wikipedia.org/wiki/Exponential_family#Table_of_distributions)I am really open to other naming, especially the
CurvedVariant
one. MaybeOverparametrizedVariant
or something like this?