Implementing a QR Decomposition Modeling Method for scp
Current Challenges in the scp Linear Regression
The current implementation of linear regression estimation in scp relies on simple matrix computation methods. While this approach is straightforward, it has several limitations:
Computationally intensive: The current method is computationally heavier than QR decomposition.
Rank Deficiency: When the model is rank deficient, the current method struggles to provide accurate estimates. Ridge regression is used as a workaround, but this is not ideal: it adds complexity and returns estimates for all coefficients, including those that are not estimable.
Lack of Interaction Terms: The current implementation does not support the inclusion of interaction terms in the model formula.
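For reference, the first two limitations can be written out as equations. This is a sketch that assumes the "simple matrix computation" mentioned above means solving the normal equations directly, which the text does not state. The notation is introduced here for illustration: X is the n x p design matrix, y the response vector, and lambda > 0 a ridge penalty.

```latex
% Ordinary least squares via the normal equations (assumed current approach)
\hat{\beta} = (X^\top X)^{-1} X^\top y
% X^\top X is singular when \operatorname{rank}(X) < p, so the inverse does not exist.

% Ridge workaround: the penalty makes the matrix invertible for any \lambda > 0
\hat{\beta}_{\lambda} = (X^\top X + \lambda I_p)^{-1} X^\top y

% QR alternative: with X = QR, Q^\top Q = I_p and R upper triangular,
% the estimate is obtained by back-substitution instead of matrix inversion
R \hat{\beta} = Q^\top y
```

Because the penalized matrix is invertible for every positive lambda, ridge always returns a value for each of the p coefficients, which is why non-estimable coefficients go unflagged.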
Implementing QR Decomposition for Linear Regression
This project aims to implement a QR decomposition method in scp for fitting linear regression models. The new approach will rely on the lm.fit function from the stats package and has several advantages:
Computationally efficient: QR decomposition is an efficient and numerically stable way to compute least-squares solutions.
Handling Rank Deficiency: With QR decomposition, rank-deficient models can be estimated by removing the parameters that the pivoting ranks last, which brings the model back to full rank.
Support for Interaction Terms: lm.fit operates on a design matrix, so interaction terms in the model formula can be included once they are expanded into design-matrix columns.
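The behaviour described in the list above can be sketched on a toy example. The data, the variable names, and the use of model.matrix to build the design matrix are assumptions made for illustration and are not taken from the scp code base.

```r
# Toy data (hypothetical): two factors, one covariate, and an exact copy of it.
set.seed(1)
df <- data.frame(
    batch = factor(rep(c("b1", "b2"), each = 4)),
    group = factor(rep(c("ctl", "trt"), times = 4)),
    x = rnorm(8)
)
df$dup <- 2 * df$x  # exact multiple of x, makes the design rank deficient
y <- rnorm(8)

# Expand the formula, including the batch:group interaction,
# into a numeric design matrix.
X <- model.matrix(~ batch * group + x + dup, data = df)

# Fit by pivoted QR decomposition.
fit <- stats::lm.fit(X, y)

fit$rank                               # numerical rank found by the QR
ncol(X)                                # number of columns in the design
names(which(is.na(fit$coefficients)))  # coefficient(s) that are not estimable
```

Coefficients that cannot be estimated are returned as NA instead of being given a value, in contrast to the ridge workaround.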
Structural Changes to scp
In addition to changing the linear regression computation method, this project will also modify how different elements of the scpModel are accessed and stored:
Elements such as effects matrices (scpModelFitEffects), residuals (scpModelFitResiduals), p (scpModelFitP), and n (scpModelFitN) will no longer be stored as slots in the scpModelFit class. Instead, these elements will be computed on the fly when needed, at the level of the scpModel class. By computing these elements dynamically rather than storing them, the memory consumption of the scpModel object will be significantly reduced.
Despite the shift to on-the-fly computation, access times for these elements should remain reasonable thanks to efficient matrix computations.
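As a rough illustration of computing these elements on demand, the sketch below keeps only the coefficients and recomputes the residuals, n, and p when asked. The stored list and the accessor names fitResiduals, fitN, and fitP are hypothetical and do not reflect the actual scpModel classes or accessors.

```r
# Toy design matrix and response (hypothetical values).
X <- cbind("(Intercept)" = 1, dose = c(0, 1, 2, 3, 4))
y <- c(1.1, 2.9, 5.2, 7.1, 8.8)

# Hypothetical minimal fit record: only the coefficients are kept.
stored <- list(coefficients = stats::lm.fit(X, y)$coefficients)

# Hypothetical accessors that recompute quantities instead of reading slots.
fitResiduals <- function(stored, X, y) {
    est <- !is.na(stored$coefficients)  # skip non-estimable coefficients
    drop(y - X[, est, drop = FALSE] %*% stored$coefficients[est])
}
fitN <- function(y) sum(!is.na(y))                         # observations used
fitP <- function(stored) sum(!is.na(stored$coefficients))  # estimated coefficients

res <- fitResiduals(stored, X, y)

# The recomputed residuals match those returned directly by lm.fit.
all.equal(unname(res), unname(stats::lm.fit(X, y)$residuals))
```

Each accessor costs one matrix product or one pass over a vector, so recomputing trades a small amount of time for not storing an n-length vector per fitted model.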
User Interface Continuity
Importantly, these changes will be implemented without affecting the interface used by the user. All the exported functions will still have the same arguments and return the same output.