Closed neilcaithness closed 5 years ago
Not so unexpected I guess. The half-obscured vector off to the left is Hour, and this dataset spans 13 hours (night-time, with fewer data points on the right). I'll either remove it or provide an unordered one-hot encoding.
I was hoping for more resolution on the smaller vectors, but perhaps this is exactly right for what I've given it.
I'd still very much appreciate any comments so I'll leave this here for a few days before closing the issue.
Sorry for the late reply, and thanks for the issue. It doesn't necessarily look off to me. For diagnosis, I guess you could check R² per observation with 2 PCs, with and without a strong outlier included, to see which algorithm, preprocessing, etc. works best.
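The diagnostic suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `r2_per_obs` is a hypothetical helper (not part of pcaMethods), the outlier is injected artificially into the first iris row, and `base::svd` stands in for whichever decomposition is being assessed. Any function returning `u`, `d` and `v` could be passed in its place.

```r
# Hypothetical helper: per-observation R^2 of a rank-k reconstruction.
# `decomp` is any list with components u, d, v (e.g. the result of svd()).
r2_per_obs <- function(X, decomp, k = 2) {
  U <- decomp$u[, 1:k, drop = FALSE]
  V <- decomp$v[, 1:k, drop = FALSE]
  Xhat <- U %*% diag(decomp$d[1:k], k) %*% t(V)
  # Share of each row's sum of squares captured by the first k components
  1 - rowSums((X - Xhat)^2) / rowSums(X^2)
}

# Clean data: centred and standardized numeric iris columns
X_clean <- scale(as.matrix(iris[, -5]))

# Same data with one artificial strong outlier (assumption for illustration)
raw_out <- as.matrix(iris[, -5])
raw_out[1, ] <- raw_out[1, ] * 10
X_out <- scale(raw_out)

r2_clean <- r2_per_obs(X_clean, svd(X_clean))
r2_out   <- r2_per_obs(X_out, svd(X_out))

# Compare how well the non-outlying rows are described in each case
summary(r2_clean[-1])
summary(r2_out[-1])
```

If the R² of the ordinary observations drops sharply once the outlier is included, the first components are being pulled towards it, which is the situation a robust decomposition is meant to help with.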
First, thanks for the package.
I want to use your `robustSvd` in an attempt to reduce model distortion by extreme outliers. Here are two outputs from the same dataset, the first using `base::svd` and the second using `pcaMethods::robustSvd`. I get a similarly unexpected outcome with the Iris dataset `iris[,-5]`, but especially marked distortions if I include one-hot encoded variables for `species`. In all cases I centre and standardize.
Any comment would be greatly appreciated.
Best regards Neil
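For anyone wanting to reproduce the Iris comparison described above, a minimal sketch might look like the following. It is not the original poster's script: `pc_scores` is a hypothetical helper, the one-hot columns are built with `model.matrix`, and the `robustSvd` part only runs if the Bioconductor package pcaMethods is installed.

```r
# Centre and standardize the four numeric iris columns, as in the post.
X <- scale(as.matrix(iris[, -5]))

# Variant with one-hot (indicator) columns for Species appended, then rescaled.
onehot <- model.matrix(~ Species - 1, data = iris)
X_onehot <- scale(cbind(as.matrix(iris[, -5]), onehot))

# Hypothetical helper: scores on the first k components, U_k %*% D_k.
pc_scores <- function(decomp, k = 2) {
  decomp$u[, 1:k, drop = FALSE] %*% diag(decomp$d[1:k], k)
}

scores_base        <- pc_scores(svd(X))
scores_base_onehot <- pc_scores(svd(X_onehot))

# robustSvd() is provided by pcaMethods; skip the comparison if it is absent.
if (requireNamespace("pcaMethods", quietly = TRUE)) {
  scores_rob <- pc_scores(pcaMethods::robustSvd(X))
  cols <- as.integer(iris$Species)
  op <- par(mfrow = c(1, 2))
  plot(scores_base, col = cols, xlab = "PC1", ylab = "PC2",
       main = "base::svd")
  plot(scores_rob, col = cols, xlab = "PC1", ylab = "PC2",
       main = "pcaMethods::robustSvd")
  par(op)
}
```

Replacing `X` with `X_onehot` in the `robustSvd` call gives the one-hot variant. Note that the three indicator columns sum to one, so after centring they are linearly dependent.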