Closed yunqiyang0215 closed 1 year ago
Do you really transform the data both ways? I would think you can just keep the original data around. That is: i) compute the transformed data (if V has changed); ii) apply the updates to estimate the transformed Us; iii) reverse-transform the Us. You never have to reverse-transform the data, because you already know the data...?
Maybe I am misunderstanding, though.
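The three steps above can be sketched as follows. This is only a rough illustration, not the package code: it assumes the transformation is whitening by the Cholesky factor of V (V = L Lᵀ, x̃ = L⁻¹x, Ũ = L⁻¹ U L⁻ᵀ), which the thread does not spell out, and it uses a single-component truncated-eigenvalue update as a stand-in for whatever update is actually applied. The names `whiten`, `update_U_transformed` and `fit_U` are hypothetical.

```python
import numpy as np

def whiten(X, V):
    """Step (i): return (X_tilde, L) with V = L @ L.T and rows x_tilde = L^{-1} x.

    X itself is never modified, so it never needs to be transformed back.
    """
    L = np.linalg.cholesky(V)              # lower triangular factor of V
    X_tilde = np.linalg.solve(L, X.T).T    # each row becomes L^{-1} x
    return X_tilde, L

def update_U_transformed(X_tilde):
    """Step (ii): stand-in update on the transformed scale (an assumption).

    For a single component with x_tilde ~ N(0, U_tilde + I), subtract 1 from
    the eigenvalues of the sample covariance and truncate them at zero.
    """
    n = X_tilde.shape[0]
    S = X_tilde.T @ X_tilde / n
    evals, evecs = np.linalg.eigh(S)
    evals = np.maximum(evals - 1.0, 0.0)
    return (evecs * evals) @ evecs.T

def fit_U(X, V):
    X_tilde, L = whiten(X, V)                 # (i) transform the data
    U_tilde = update_U_transformed(X_tilde)   # (ii) update transformed U
    return L @ U_tilde @ L.T                  # (iii) reverse-transform U only
```

Only U goes through the reverse transform; the original X is read once and left alone, so it cannot pick up round-off from repeated transformations.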
This sounds like a good solution. I changed the code to follow the three steps above, and it works. The remaining question is whether we want to perform the transformation at every iteration, or only at the first iteration and then keep updating on the transformed data and transformed U throughout.
Sorry, this was my attempt at making the code more elegant by keeping track of everything in a single "fit" data structure (including the X matrix). I agree that there are problems with this approach (as I discussed with Yunqi in person).
> Then the question is if we want to perform the transformation every iteration or only at the first iteration and then update on the transformed data and U all the time.
If V is unchanging, then yes, in the i.i.d. case I don't see any reason to apply the transformation more than once.
It will be helpful to have tests that check that our i.i.d. calculations (after using this transformation trick) give the same result as the non-i.i.d. calculations (and I think we do have some tests already).
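One possible shape for such a test is sketched below. It assumes a zero-mean mixture model x ~ Σ_k w_k N(0, U_k + V) and Cholesky whitening (V = L Lᵀ); the function names are hypothetical and are not the package's API. The direct calculation works on the original data, while the whitened calculation works on x̃ = L⁻¹x with Ũ_k = L⁻¹ U_k L⁻ᵀ and identity noise covariance, then corrects for the change of variables by subtracting n log det L.

```python
import numpy as np

def logpdf_rows(X, S):
    """Log density of N(0, S) evaluated at each row of X."""
    m = X.shape[1]
    L = np.linalg.cholesky(S)
    Z = np.linalg.solve(L, X.T)                      # m x n, columns L^{-1} x
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + np.sum(Z**2, axis=0))

def mixture_loglik(X, w, Us, V):
    """Direct calculation on the original data."""
    comps = np.stack([np.log(wk) + logpdf_rows(X, Uk + V)
                      for wk, Uk in zip(w, Us)])
    return np.sum(np.logaddexp.reduce(comps, axis=0))

def mixture_loglik_whitened(X, w, Us, V):
    """Same quantity computed on whitened data, plus the Jacobian correction."""
    n, m = X.shape
    L = np.linalg.cholesky(V)
    X_tilde = np.linalg.solve(L, X.T).T
    # U_tilde = L^{-1} U L^{-T}, using the symmetry of U
    Us_tilde = [np.linalg.solve(L, np.linalg.solve(L, Uk).T) for Uk in Us]
    ll_tilde = mixture_loglik(X_tilde, w, Us_tilde, np.eye(m))
    return ll_tilde - n * np.sum(np.log(np.diag(L)))
```

A test would then assert that the two functions agree to numerical tolerance (for example with `np.isclose`) on a few simulated data sets.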
In the current implementation, the data transformation is performed at every iteration. As numerical errors accumulate, X, as well as U0 and Q in the scaled U, will change. These are not supposed to change.
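A small experiment along these lines can show the effect. This is a sketch under the assumption that the transformation is Cholesky whitening (V = L Lᵀ) applied and undone in place at every iteration; `roundtrip_drift` is a hypothetical helper, and the size of the drift will depend on how well conditioned V is.

```python
import numpy as np

def roundtrip_drift(X, V, n_iter):
    """Largest absolute change in X after n_iter transform/back-transform cycles."""
    L = np.linalg.cholesky(V)
    X_work = X.copy()
    for _ in range(n_iter):
        X_work = np.linalg.solve(L, X_work.T).T   # forward: rows L^{-1} x
        X_work = X_work @ L.T                     # backward: rows L x_tilde
    return np.max(np.abs(X_work - X))

rng = np.random.default_rng(0)
V = np.array([[1.0, 0.5], [0.5, 2.0]])
X = rng.standard_normal((100, 2))

# Round-off picked up by X when it is transformed both ways each iteration.
print(roundtrip_drift(X, V, n_iter=1000))
```

If the original X is kept and only the transformed copy is recomputed when needed, the stored data stays bit-for-bit identical by construction.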
The main question here is whether it would be better to do the data transformation only twice: once before the iterations start and once after they end. The problems here are:
That form of implementation may not work if we want to update V. However, we assume V is known in our model and we don't want to update it for now. Later on, I am not sure whether we will want to spend effort on estimating V. I am also not sure whether the way V is updated in the current version is correct.
If we only do the transformation twice, all the evaluations (the log-likelihood, and the differences in w and U) will be based on the transformed data. I am not sure whether that is appropriate.
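On the second point, a short derivation may help. It assumes the transformation is whitening by the Cholesky factor of V, that is V = L Lᵀ, x̃_j = L⁻¹ x_j and Ũ_k = L⁻¹ U_k L⁻ᵀ (the symbols L, ℓ and ℓ̃ are introduced here only for illustration). By the change-of-variables formula, the log-likelihoods on the two scales are related by

$$
\ell(w, U; X) = \tilde{\ell}(w, \tilde{U}; \tilde{X}) - n \log\det L,
\qquad
U_k = L \, \tilde{U}_k \, L^\top .
$$

Under this assumption the offset n log det L does not depend on w or U, so as long as V is fixed the change in log-likelihood between two iterations is the same on either scale. The mixture weights w are not affected by the transformation at all. Only the change in U differs, since ΔU_k = L ΔŨ_k Lᵀ, so a tolerance on the change in U means something slightly different on the transformed scale.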