Closed peterTorrione closed 12 years ago
Maybe add a method: MERGE
dsMerge = merge(ds1,ds2)
merge combines all of the information in ds1 and ds2 to form dsMerge. dsMerge contains all of the observations from ds1 and ds2, and its targets and class labels are generated from the classNames of each original data set. Contrast this with catObservations, where the targets of the output are taken directly from the .targets fields of ds1 and ds2. Unlike catObservations, merge can combine data sets with different class names. Also unlike catObservations, the .targets field of dsMerge is not guaranteed to match the concatenation of the .targets fields of ds1 and ds2.
Also, allow any number of inputs:
dsMerge = merge(ds1,ds2,...)
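To make the proposed semantics concrete, here is a minimal sketch of a name-driven merge. It is written in Python rather than MATLAB and does not use any PRT code: the dict layout (`targets`, `class_names`) and the function `merge_by_name` are hypothetical stand-ins for a data set and the proposed method. It assumes new integers are handed out in order of first appearance, which is only one possible convention.

```python
def merge_by_name(*datasets):
    """Concatenate targets, renumbering classes by name (hypothetical helper)."""
    name_to_int = {}      # class name -> integer target in the merged result
    merged_targets = []
    for ds in datasets:
        for t in ds["targets"]:
            name = ds["class_names"][t]
            if name not in name_to_int:
                # First time this name is seen: give it the next integer.
                name_to_int[name] = len(name_to_int) + 1
            merged_targets.append(name_to_int[name])
    merged_names = {i: n for n, i in name_to_int.items()}
    return {"targets": merged_targets, "class_names": merged_names}


# Assumed toy inputs: integer 2 means "rdx" in ds1 but "hme" in ds2.
ds1 = {"targets": [1, 1, 2], "class_names": {1: "tnt", 2: "rdx"}}
ds2 = {"targets": [2, 3], "class_names": {2: "hme", 3: "un"}}

merged = merge_by_name(ds1, ds2)
naive = ds1["targets"] + ds2["targets"]   # what plain concatenation would give
```

Here `merged["targets"]` differs from `naive`, which is the "not guaranteed to match the concatenation" behavior described above. Note that this sketch only keeps names that actually occur in the targets; whether unused names should survive a merge is a separate design choice.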
What is the current status of this? I see that it has been reopened.
And then I reopened it again. Stupid "comment & close" button...
catObservations now does (I think) what it should.
Tests are in prtTestCatObservationsCatNames and are close to complete; this seems to work for now.
Some weird examples.
ds1 has class 1, name "tnt"
ds2 has class 1, name "rdx"
Should I be able to catObservations these two? Right now I can't, since TNT and RDX conflict in class integer space. Maybe allow catObservations(ds1,ds2,'-force') to force the PRT to merge them, so that RDX becomes class 2?
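A rough sketch of what a strict default plus a force option might look like. This is Python, not PRT code: `cat_observations`, its `force` argument, and the dict layout are all hypothetical, and the remapping rule (keep the first data set's numbering, move clashing classes to the next unused integer) is an assumption, not a description of what catObservations does today.

```python
def cat_observations(ds_a, ds_b, force=False):
    """Concatenate targets; remap clashing classes only when force=True."""
    names = dict(ds_a["class_names"])               # int -> name, starts as ds_a's view
    name_to_int = {n: k for k, n in names.items()}
    remap = {}                                      # ds_b integer -> integer in result
    for k, n in sorted(ds_b["class_names"].items()):
        if names.get(k) == n:
            remap[k] = k                            # (integer, name) pair already agrees
        elif k not in names and n not in name_to_int:
            remap[k] = k                            # brand-new class on a free integer
        elif not force:
            raise ValueError(f"class {k} ({n!r}) conflicts with the first data set")
        elif n in name_to_int:
            remap[k] = name_to_int[n]               # same name: reuse existing integer
        else:
            remap[k] = max(names) + 1               # clash: move to next unused integer
        names[remap[k]] = n
        name_to_int[n] = remap[k]
    targets = ds_a["targets"] + [remap[t] for t in ds_b["targets"]]
    return {"targets": targets, "class_names": names}


# The tnt/rdx case from above: both data sets use integer 1.
ds1 = {"targets": [1, 1], "class_names": {1: "tnt"}}
ds2 = {"targets": [1], "class_names": {1: "rdx"}}

try:
    cat_observations(ds1, ds2)
    strict_failed = False
except ValueError:
    strict_failed = True

forced = cat_observations(ds1, ds2, force=True)
```

With these inputs the strict call raises, and the forced call moves "rdx" to class 2 while leaving "tnt" at class 1.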
Also, say
ds1 has classes 1 and 2, names tnt and rdx
ds2 has classes 2 and 3, names hme and un
ds1 = ds1.retainClasses('tnt');
catObservations(ds1,ds2)
This also errors, since the internal understanding inside ds1 is that "2" still corresponds to "rdx".
Three questions here:
1) Should we check the class name cache all the time and reduce it, so if there aren't any "2"'s, the object doesn't know about "rdx"? This seems weird in some ways, but correct in others.
2) Should we default to "-force" in catObservations?
3) More to the point, should we hide "targets" and rely more on interfaces through "class" strings? This seems like a pain, but if catObservations is messing with .targets (which it has to do to make the above work), then maybe this is all we can do?
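For question 1, a small sketch of what reducing the class name cache would change in the retainClasses example. Again this is Python with hypothetical names (`prune_class_names`, `conflicting_integers`, and the dict layout), not PRT code; it only illustrates the idea of dropping names whose integer no longer appears in the targets.

```python
def prune_class_names(ds):
    """Drop cached names whose integer no longer occurs in the targets."""
    present = set(ds["targets"])
    kept = {k: n for k, n in ds["class_names"].items() if k in present}
    return {"targets": list(ds["targets"]), "class_names": kept}


def conflicting_integers(ds_a, ds_b):
    """Integers that both data sets use, but for different class names."""
    a, b = ds_a["class_names"], ds_b["class_names"]
    return sorted(k for k in a.keys() & b.keys() if a[k] != b[k])


# ds1 after retaining only 'tnt': no 2's remain, but the cache still knows "rdx".
ds1 = {"targets": [1, 1], "class_names": {1: "tnt", 2: "rdx"}}
ds2 = {"targets": [2, 3], "class_names": {2: "hme", 3: "un"}}

before = conflicting_integers(ds1, ds2)
after = conflicting_integers(prune_class_names(ds1), ds2)
```

Before pruning, integer 2 is reported as a conflict ("rdx" vs "hme") even though ds1 contains no class-2 observations; after pruning the conflict goes away. The trade-off is the one raised in question 1: the object forgets about "rdx" as soon as its observations are removed.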