Current train of thought, in general independent of the choice of problem. As of now, I have three vague approach ideas:
Let's start with the naive approach first and see where it goes.
Next steps:
- Start with a single matrix and overfit on that to see if you can invert it (a minimal sketch of this step follows below).
- Then move to a few different matrices, to see if you can fit those, and so on.
- Maybe aiming for a decomposition could be more tractable?
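For reference, a minimal sketch of what the single-matrix overfitting step could look like, assuming a PyTorch setup; the layer sizes, learning rate, and the test matrix are placeholder choices, not the configuration used in the experiments below.

```python
# Sketch: overfit a small MLP to reproduce the inverse of one fixed matrix.
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 3
X = torch.randn(n, n) + n * torch.eye(n)   # one well-conditioned test matrix
X_inv = torch.linalg.inv(X)

# Flattened matrix in, flattened inverse out.
model = nn.Sequential(
    nn.Linear(n * n, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, n * n),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5000):
    pred = model(X.flatten()).reshape(n, n)
    loss = torch.mean((pred - X_inv) ** 2)   # supervised loss against the true inverse
    opt.zero_grad()
    loss.backward()
    opt.step()

# If the model has memorised the matrix, X @ pred should be close to identity.
print(torch.dist(X @ pred.detach(), torch.eye(n)))
```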
Progress so far: the pipeline seems okay. So far I've managed to (over-)fit a ReLU 6-layer MLP to 1000 matrices with some degree of accuracy; the off-diagonal elements of X * X_inv_pred are on the order of 10^-3 (smaller for fewer matrices, obviously). With 8 layers, I get reasonable but not yet satisfactory results when training on 10000 matrices: off-diagonals of at most ~10^-2, with the diagonal close to 1. All of this is for matrices that are similar to each other - they have eigenvalues close to each other and are simultaneously diagonalizable, so they have a pretty homogeneous structure.
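To make the setup above concrete, here is a sketch of how such "similar" matrices and the identity-error check on X * X_inv_pred could look, assuming NumPy; the shared eigenvector basis Q and the eigenvalue perturbation scale are illustrative assumptions, not the exact generator used in the experiments.

```python
# Sketch: simultaneously diagonalizable matrices (shared eigenvectors,
# eigenvalues close to each other) and the off-diagonal / diagonal check.
import numpy as np

rng = np.random.default_rng(0)
n = 3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # one shared eigenvector basis

def sample_matrix():
    # Eigenvalues clustered around a common value -> homogeneous structure.
    eigvals = 2.0 + 0.1 * rng.standard_normal(n)
    return Q @ np.diag(eigvals) @ Q.T

def identity_error(X, X_inv_pred):
    # Max absolute off-diagonal entry and max diagonal deviation of X @ X_inv_pred.
    P = X @ X_inv_pred
    off_diag = np.max(np.abs(P - np.diag(np.diag(P))))
    diag_dev = np.max(np.abs(np.diag(P) - 1.0))
    return off_diag, diag_dev

X = sample_matrix()
print(identity_error(X, np.linalg.inv(X)))   # ~ (0, 0) for the exact inverse
```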
The model actually seems to have a bit of generalizability to new matrices, at least when they are generated via the same function. The effect is small and not very stable, but it indicates that the model is indeed learning something. This perhaps implies that a first application could be in a specialized setting where the matrices share a known, similar structure! Steps forward:
- Once we decide on the problem, we should try to find a minimal subproblem to start with (e.g. for matrix inversion, start with 3x3 real matrices; see the dataset sketch after this list). Then we can build a prototype notebook for that and, from there, decide how to turn it into a nice module.
- If we already have ideas about it, we might also consider different matrix sizes, etc.
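A hypothetical sketch of what the minimal subproblem could look like as data: random real 3x3 matrices paired with their inverses, wrapped as a PyTorch Dataset so a prototype notebook can feed it straight into a DataLoader. The diagonal shift used to keep the samples well-conditioned is an assumption, not a decided design choice.

```python
# Sketch: dataset of random 3x3 real matrices and their exact inverses.
import torch
from torch.utils.data import Dataset

class RandomInverse3x3(Dataset):
    def __init__(self, num_matrices: int, seed: int = 0):
        g = torch.Generator().manual_seed(seed)
        # Shift by 3*I so the sampled matrices stay comfortably invertible.
        self.X = torch.randn(num_matrices, 3, 3, generator=g) + 3 * torch.eye(3)
        self.X_inv = torch.linalg.inv(self.X)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        # Flattened input / target, matching an MLP that maps 9 values to 9 values.
        return self.X[idx].flatten(), self.X_inv[idx].flatten()

# Usage, e.g.: DataLoader(RandomInverse3x3(1000), batch_size=32, shuffle=True)
```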