Closed: axiqia closed this issue 5 years ago
Argh! This might actually be a bug. I think transformInputs just copies the label container from raw_data, so the labels are still shared with raw_data.
Could you check whether calling the following before splitting helps?
data.labels().makeIndependent()
I have to ask myself whether the independence check is doing more harm than good here.
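To make the "shared vs. independent" idea concrete, here is a toy sketch in plain C++. It is not Shark's SharedContainer (which works on batches and is more involved); the class ToyLabels and its members are hypothetical names chosen for illustration. It only shows why a copy that shares storage has to be made independent before it can be split in place.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy container (hypothetical, not Shark code): copies share one buffer
// until makeIndependent() gives the copy its own storage.
class ToyLabels {
public:
    explicit ToyLabels(std::vector<int> values)
        : m_values(std::make_shared<std::vector<int>>(std::move(values))) {}

    std::size_t size() const { return m_values->size(); }

    // Independent means: nobody else holds a reference to our buffer.
    bool isIndependent() const { return m_values.use_count() == 1; }

    // Deep-copy the buffer if it is currently shared.
    void makeIndependent() {
        if (!isIndependent()) {
            m_values = std::make_shared<std::vector<int>>(*m_values);
        }
    }

    // Keeps the first n elements here and returns the rest.
    // This mutates the buffer, so it refuses to run on shared storage.
    ToyLabels splitAt(std::size_t n) {
        if (!isIndependent()) {
            throw std::logic_error("container is not independent");
        }
        assert(n <= m_values->size());
        auto cut = m_values->begin() + static_cast<std::ptrdiff_t>(n);
        ToyLabels tail{std::vector<int>(cut, m_values->end())};
        m_values->resize(n);
        return tail;
    }

private:
    std::shared_ptr<std::vector<int>> m_values;
};
```

In this sketch, copying a ToyLabels and then calling splitAt on the copy throws until makeIndependent() has been called on it, which mirrors the workaround suggested above.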
It works :)
Dear @Ulfgard, is there an API to transform a RealVector using a Normalizer? E.g.:
//load scaler from the file
ifstream ifs2(argv[3]);
TextInArchive ia2(ifs2);
Normalizer<RealVector> normalizer2;
normalizer2.load(ia2, 0);
ifs2.close();
double param[] = {
241.28,1, 1, 0, 1, 0, 1, 1, 1, 14745600, 14745600, 14745600, 14745600, 1, 0 ,0 ,1, 1, 1, 14745600, 14745600, 14745600, 14745600, 1
};
RealVector onetest(23);
std::copy(param, param+23, onetest.begin());
RealVector points = transform(onetest, normalizer2);
I could only find shark::transform(Data<T> const &data, Functor f).
Thank you :)
Normalizer is a model, so you can call it directly:
normalizer2(onetest)
It would also work with Data as the argument if you happen to have many data points.
So I can also use normalizer2(data.inputs())
and the result is the same as transformInputs(data.inputs())?
Do I understand correctly?
Yes, this will internally just call transform, so you are fine. This is pure convenience.
exactly
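As an illustration of the point confirmed above (calling the model on a whole data set is just a convenience wrapper around a transform over single points), here is a minimal, Shark-free sketch. ToyNormalizer and transformAll are hypothetical stand-ins, not Shark's Normalizer or shark::transform; the sketch assumes a normalizer is a per-component affine map.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;
using Batch = std::vector<Point>;

// Generic helper in the spirit of a transform(data, functor) call:
// applies f to every point and collects the results.
template <class F>
Batch transformAll(Batch const& xs, F const& f) {
    Batch ys;
    ys.reserve(xs.size());
    for (Point const& x : xs) {
        ys.push_back(f(x));
    }
    return ys;
}

// Hypothetical stand-in for a normalizer: y[i] = scale[i] * x[i] + offset[i].
struct ToyNormalizer {
    Point scale;
    Point offset;

    // Apply the model to a single point.
    Point operator()(Point const& x) const {
        assert(x.size() == scale.size() && x.size() == offset.size());
        Point y(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            y[i] = scale[i] * x[i] + offset[i];
        }
        return y;
    }

    // Apply the model to many points: pure convenience,
    // it simply forwards to the generic helper.
    Batch operator()(Batch const& xs) const {
        return transformAll(xs, *this);
    }
};
```

With this layout, calling the normalizer on a batch and calling transformAll with the normalizer as the functor produce the same values, so which one to use is a matter of taste.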
`normalizing_train.train(normalizer_train, training_data.inputs())`
This should fix your issue. Please remember not to train a normalizer on the test set; instead, reuse the normalizer trained on the training set:
normalizedData_test = transformInputs(test_data, normalizer_train);
Originally posted by @Ulfgard in https://github.com/Shark-ML/Shark/issues/51#issuecomment-189819470
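The advice quoted above (estimate the normalization on the training inputs only, then reuse it for the test inputs) can be sketched without Shark as follows. FeatureStats, fitStats and applyStats are hypothetical names, and the sketch assumes normalization to zero mean and unit variance per feature; the trainer used in the thread may be configured differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;
using Batch = std::vector<Point>;

// Per-feature statistics estimated from the training inputs.
struct FeatureStats {
    Point mean;
    Point stddev;
};

// Estimate mean and (population) standard deviation from the training set only.
FeatureStats fitStats(Batch const& train) {
    assert(!train.empty());
    std::size_t const dim = train.front().size();
    FeatureStats s{Point(dim, 0.0), Point(dim, 0.0)};
    for (Point const& x : train) {
        assert(x.size() == dim);
        for (std::size_t i = 0; i < dim; ++i) s.mean[i] += x[i];
    }
    for (std::size_t i = 0; i < dim; ++i) s.mean[i] /= train.size();
    for (Point const& x : train) {
        for (std::size_t i = 0; i < dim; ++i) {
            double const d = x[i] - s.mean[i];
            s.stddev[i] += d * d;
        }
    }
    for (std::size_t i = 0; i < dim; ++i) {
        s.stddev[i] = std::sqrt(s.stddev[i] / train.size());
    }
    return s;
}

// Apply previously fitted statistics to any data set (training or test).
Batch applyStats(Batch const& data, FeatureStats const& s) {
    Batch out = data;
    for (Point& x : out) {
        assert(x.size() == s.mean.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            // Constant features are centred but not scaled, to avoid dividing by zero.
            double const sd = s.stddev[i] > 0.0 ? s.stddev[i] : 1.0;
            x[i] = (x[i] - s.mean[i]) / sd;
        }
    }
    return out;
}
```

The key design point is that fitStats never sees the test data: the test set is passed only to applyStats, together with the statistics fitted on the training set.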
After I normalized the raw data, I got a runtime error when I split the data into dataTest.