Closed rhyolight closed 8 years ago
@subutai @scottpurdy Any chance you guys can take a look at this?
BTW, this temporary fix has things working for me locally. So I can proceed with my experiments, but no one else will be able to run them.
@rhyolight - all of the documentation on `getDescription` (e.g. this) suggests that the list indicates the bit offsets in the encoding. So it looks like the geospatial encoder incorrectly specifies the encoder inputs, and rather than removing altitude, you could just have it return `[("coordinate", 0)]`. Can you try that and see if it works?
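To illustrate the bit-offset reading of `getDescription` mentioned above, here is a minimal sketch in plain Python. The helper `field_spans` and the width of 16 bits are hypothetical and not part of NuPIC; the sketch only assumes that a description is a list of `(name, offset)` pairs and that each field runs until the next field's offset.

```python
def field_spans(description, total_width):
    """Return {name: (start, end)} bit ranges implied by (name, offset) pairs.

    Hypothetical helper, not part of NuPIC. Assumes the offsets are sorted
    and that each field extends up to the next field's offset.
    """
    spans = {}
    for i, (name, start) in enumerate(description):
        # The last field runs to the end of the encoding.
        end = description[i + 1][1] if i + 1 < len(description) else total_width
        spans[name] = (start, end)
    return spans


# Description from the thread: read as bit offsets, the first three
# fields would each be a single bit wide.
per_field = [("speed", 0), ("longitude", 1), ("latitude", 2), ("altitude", 3)]
# Suggested alternative: one field covering the whole encoding.
single = [("coordinate", 0)]

WIDTH = 16  # arbitrary illustrative width, not a NuPIC default
per_field_spans = field_spans(per_field, WIDTH)
single_spans = field_spans(single, WIDTH)
```

Under this reading, offsets 0 through 3 describe one-bit fields, which is unlikely to be what the geospatial encoder intends, while a single `("coordinate", 0)` entry spans the whole output.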
@scottpurdy I don't understand where the `coordinate` is coming from. Does it represent the entire vector with speed, lat/lon, and altitude?
When I replace `return [('speed', 0), ('longitude', 1), ('latitude', 2), ('altitude', 3)]` with `return [("coordinate", 0)]` I get the following error:
Traceback (most recent call last):
  File "./htm/nupic/process-gpx-nupic.py", line 222, in <module>
    run(inputPath, options.useTimeEncoders, options.scale)
  File "./htm/nupic/process-gpx-nupic.py", line 82, in run
    runOnePoint(point, model, useTimeEncoders, anomalyLikelihood)
  File "./htm/nupic/process-gpx-nupic.py", line 142, in runOnePoint
    result = model.run(modelInput)
  File "/Users/mtaylor/nta/nupic/src/nupic/frameworks/opf/clamodel.py", line 395, in run
    self._sensorCompute(inputRecord)
  File "/Users/mtaylor/nta/nupic/src/nupic/frameworks/opf/clamodel.py", line 480, in _sensorCompute
    sensor.compute()
  File "/Users/mtaylor/nta/nupic/src/nupic/engine/__init__.py", line 458, in compute
    return self._region.compute()
  File "/Users/mtaylor/nta/nupic.core/bindings/py/nupic/bindings/engine_internal.py", line 1393, in compute
    return _engine_internal.Region_compute(self)
  File "/Users/mtaylor/nta/nupic/src/nupic/regions/RecordSensor.py", line 334, in compute
    outputs["sourceOut"][:] = self.encoder.getScalars(data)
ValueError: could not broadcast input array from shape (5) into shape (3)
This is a familiar error, because I got it when I removed the "altitude" component from the input row vector completely:
modelInput = {
    "vector": (speed, longitude, latitude, altitude)
}
This is how the input row is created in the geospatial example. When I simply removed the altitude from the tuple, I got this same ValueError from the `RecordSensor`, but with different shape values. If I implement the change as you suggested, it will obviously change the encoder's API, and we'd need to update our example code.
So I guess my question is, how do I change the input vector to adhere to this new input model?
@rhyolight - Ok, so it assumes 1:1 input scalars to output bit arrays. The geospatial encoder is explicitly not intended to run within the OPF (CLAModel). For this use case there are a couple of options:
1. Change `None` to `0` for the altitude inside the `getScalars` function. This is what I would recommend.
2. Remove the altitude from `getDescription`, `getScalars`, and anywhere else.
I cannot do (1) because it increases the processing time insanely, I've already tried it.
I'll try (2).
> I cannot do (1) because it increases the processing time insanely, I've already tried it.
I take that back, that's not exactly what I tried. I tried sending `0` as the altitude input in the input vector. I'll try your (1) above and see if it has the same effect on processing time.
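Option (1) above could look roughly like the following. This is a sketch under assumptions, not the NuPIC implementation: `scalars_with_default_altitude` is a hypothetical stand-in for the relevant part of `getScalars`, and the input is assumed to be the `(speed, longitude, latitude, altitude)` tuple used in the example code.

```python
def scalars_with_default_altitude(vector):
    """Return the input scalars with a missing altitude mapped to 0.

    Hypothetical stand-in for the relevant part of getScalars; assumes
    `vector` is (speed, longitude, latitude, altitude) and that altitude
    may be None.
    """
    speed, longitude, latitude, altitude = vector
    if altitude is None:
        altitude = 0  # substitute only in the reported scalars
    return [speed, longitude, latitude, altitude]


# Description as quoted earlier in the thread.
DESCRIPTION = [("speed", 0), ("longitude", 1), ("latitude", 2), ("altitude", 3)]

scalars = scalars_with_default_altitude((5.0, -122.2, 37.5, None))
# One scalar per description entry, and no None left for later averaging.
assert len(scalars) == len(DESCRIPTION)
```

The idea, as far as the thread suggests, is that the substitution happens only in the scalars reported downstream, which differs from passing `0` as the altitude in the input vector itself.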
Thanks @scottpurdy for leading me in the right direction via chat. See https://github.com/numenta/nupic/pull/3082 for a fix that doesn't remove any altitude processing capabilities.
I ran into difficulties generating anomaly likelihood values with a GeospatialCoordinateEncoder (GCE), and I believe the root cause is the `altitude` portion of the raw input vector.
In all current examples of GCE usage, the input vector is created like this:
... where `altitude` may be `None`. This starts breaking when you use the anomaly likelihood post-process, because it introduces arrays with `None` values, and numpy complains when trying to `mean` the data:
I tried using `numpy.nanmean` but that didn't help. Another workaround is to use `0` for altitude, but this drastically increases processing time by adding another dimension to the input that doesn't really exist.
If you look at how the GCE encodes input data:
You can see that `altitude` is optionally included in the coordinate. However, the description always returns the altitude:
I believe this is the root problem, but I'm not sure how to fix it. To get things running for my current project, I have just manually removed the `('altitude', 3)` from the description array, and I can now run NuPIC without an `altitude` and post-process the anomaly likelihood properly.
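The averaging failure described above can be reproduced without NuPIC. The sketch below uses a plain-Python mean rather than numpy, so the exact error message will differ from the one numpy reports; `plain_mean` and `mean_ignoring_none` are hypothetical helpers shown only to make the failure mode concrete.

```python
def plain_mean(values):
    """Arithmetic mean; raises TypeError if any value is None."""
    return sum(values) / len(values)


def mean_ignoring_none(values):
    """Mean over the non-None values (hypothetical helper, not NuPIC)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no non-None values to average")
    return sum(present) / len(present)


altitudes = [10.0, None, 14.0]

try:
    plain_mean(altitudes)
    failed = False
except TypeError:
    # Adding a float and None is not defined, so the sum fails.
    failed = True

average = mean_ignoring_none(altitudes)
```

This is likely also why `numpy.nanmean` did not help: it skips NaN values, and `None` is not a NaN.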