Geometry
Given the results from the spherical test case, things look promising. The next class of shapes to train on involves the platform. I've been going back and forth on this, but I think we should start with encoding orientation in the geometry latent code rather than in the kinematic state, since the sphere is the only object that moves in our pilot domain and its motion doesn't depend on its orientation.
With that design, the relevant action items are:
- Creating the new dataset, `PlatformGeometryDataset`, with the same interface as https://github.com/CNCLgithub/Cusanus/blob/30f35642b33d16ae978066516096617cd90be39a/cusanus/datasets/geometry.py#L180
  - The occupancy field returns 0 if distance > 0, otherwise 1 (similar to `spherical_occupancy_field`, https://github.com/CNCLgithub/Cusanus/blob/30f35642b33d16ae978066516096617cd90be39a/cusanus/datasets/geometry.py#L57-L61)
  - Queries `qs` are sampled such that roughly 50% are within the obstacle (a sketch follows below)
- Merging with the spherical geometry dataset
This could be a good task for you, Jack, as it can bring you up to speed with the INRs, but you may have other priorities with the kinematics dataset.
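To make the sampling target concrete, here is a minimal sketch of a platform occupancy field and the balanced query sampling, assuming an axis-aligned box centered at the origin; `platform_occupancy_field`, `sample_platform_queries`, and `extents` are hypothetical names, not the repo's API.

```python
import numpy as np

def platform_occupancy_field(qs: np.ndarray, extents: np.ndarray) -> np.ndarray:
    """Occupancy for an axis-aligned platform centered at the origin.

    Returns 1.0 for queries inside the box and 0.0 otherwise,
    mirroring the contract of `spherical_occupancy_field`.
    """
    # Per-axis distance to the box surface; > 0 on any axis means outside.
    d = np.abs(qs) - extents          # (n, 3)
    outside = np.any(d > 0.0, axis=-1)
    return np.where(outside, 0.0, 1.0)

def sample_platform_queries(n: int, extents: np.ndarray, margin: float = 2.0):
    """Sample ~half the queries inside the platform, the rest around it."""
    n_in = n // 2
    inside = np.random.uniform(-extents, extents, size=(n_in, 3))
    # Queries "around" the box can still land inside, so the split is
    # roughly, not exactly, 50/50.
    around = np.random.uniform(-margin * extents, margin * extents,
                               size=(n - n_in, 3))
    qs = np.concatenate([inside, around], axis=0)
    ys = platform_occupancy_field(qs, extents)
    return qs, ys
```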
Kinematics
Some details here will depend on the current format of the physical traces stored in your repo, Jack.
INR outline
Similar to the geometry INR, the kinematics function will take queries consisting of time, `t`, and kinematic state, `k`. For now, since we are encoding orientation in the geometry latent code, `k` will consist of position and linear velocity. The kinematics INR will then return the probability of `k` given the kinematic latent code `z_k` for the time query.
Queries will be generated in a similar way to geometry, returning 1.0 roughly 50% of the time. This can be achieved with the following procedure (a code sketch follows the list):
- Obtain the positive queries (returning 1.0) from the ground-truth simulation for some window of `t` (let's use 1 second, or 60 frames).
- Normalize `k` to be centered around 0 and with sd 1.0.
- Keep `t` in seconds and with a reference frame relative to `z_k`, so `t = [0, 1]` for a batch spanning 1 second.
- Generate the negative queries (returning 0.0) by perturbing the `k` in each positive query with Gaussian noise, setting sigma to some value larger than the inter-frame distance of the ground truth (maybe 3x).
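Here is a minimal sketch of that procedure, assuming the ground-truth trace for a 1-second window is available as a (60, 6) array of positions and velocities; `make_kinematic_queries` and `gt_k` are hypothetical names, and the 3x multiplier is the value suggested above.

```python
import numpy as np

def make_kinematic_queries(gt_k: np.ndarray, fps: int = 60, noise_scale: float = 3.0):
    """Build positive and negative (t, k) queries from one 1-second GT window.

    gt_k: (fps, 6) per-frame kinematic state (position + linear velocity).
    Returns queries of shape (2*fps, 7) and target probabilities (2*fps,).
    """
    # Normalize k to zero mean and unit sd (per dimension) over the window.
    k = (gt_k - gt_k.mean(axis=0)) / (gt_k.std(axis=0) + 1e-8)

    # Time in seconds, relative to the window, so t spans [0, 1].
    t = np.linspace(0.0, 1.0, fps)[:, None]        # (fps, 1)
    positives = np.concatenate([t, k], axis=1)     # targets: p = 1.0

    # Sigma larger than the mean inter-frame distance of the GT (maybe 3x).
    interframe = np.linalg.norm(np.diff(k, axis=0), axis=1).mean()
    sigma = noise_scale * interframe
    negatives = positives.copy()
    negatives[:, 1:] += np.random.normal(0.0, sigma, size=k.shape)  # p = 0.0

    qs = np.concatenate([positives, negatives], axis=0)
    ys = np.concatenate([np.ones(fps), np.zeros(fps)])
    return qs, ys
```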
Training
We can then apply the same workflow as the geometry datasets and INRs for training. Since all INRs in this project return scalars, we will not need to change many parameters. The main parameters to change will involve the Siren hyperparameters (https://github.com/CNCLgithub/Cusanus/blob/30f35642b33d16ae978066516096617cd90be39a/cusanus/archs/siren.py#L34-L44); we might need to extend the API to set these from a config file (a sketch follows below).
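If we go the config route, a sketch of what that could look like is below; the hyperparameter names (`hidden_dim`, `n_layers`, `w0`) are illustrative guesses and would need to be matched against the actual fields in cusanus/archs/siren.py.

```python
from dataclasses import dataclass
import yaml  # assumes PyYAML is available

@dataclass
class SirenConfig:
    # Illustrative hyperparameters; align these with the actual Siren class.
    hidden_dim: int = 256
    n_layers: int = 5
    w0: float = 30.0        # frequency scaling of the first layer
    w0_hidden: float = 1.0  # frequency scaling of hidden layers

def load_siren_config(path: str) -> SirenConfig:
    """Read Siren hyperparameters from a YAML config file."""
    with open(path) as f:
        raw = yaml.safe_load(f) or {}
    return SirenConfig(**raw.get("siren", {}))
```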
Visualization
We will also need to extend our visualization tools for the kinematics domain. For a given time `t`, we can optimize `k` and plot a scatter plot where color is the output probability. Since the output probability is monotonic, we can use `-log(p)` for optimization. We can generate a cloud of points by jittering `k` to get a sense of the variance (see the sketch below).
For plotting, we can use a similar scheme to geometry, where we show slices of `k` across time. Another possibility is to use volumetric plots directly, such as https://plotly.com/python/3d-volume-plots/.
Note: such volumetric plots would be great for geometry as well
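A minimal sketch of the scatter visualization, assuming the fitted kinematics INR is available as a callable `inr(qs, z_k)` that returns probabilities (that interface is an assumption): it jitters `k`, descends on `-log(p)`, and colors the resulting cloud by probability.

```python
import torch
import matplotlib.pyplot as plt

def visualize_kinematics(inr, z_k, t: float, n_points: int = 256, steps: int = 100):
    """Scatter of optimized k at time t, colored by output probability."""
    # Start from a jittered cloud of k values to expose the variance.
    k = torch.randn(n_points, 6, requires_grad=True)
    t_col = torch.full((n_points, 1), t)
    opt = torch.optim.Adam([k], lr=1e-2)

    for _ in range(steps):
        opt.zero_grad()
        p = inr(torch.cat([t_col, k], dim=1), z_k).clamp_min(1e-8)
        # Since p is monotonic, -log(p) is a valid optimization objective.
        loss = -torch.log(p).mean()
        loss.backward()
        opt.step()

    with torch.no_grad():
        p = inr(torch.cat([t_col, k], dim=1), z_k)
    k_np = k.detach().numpy()
    plt.scatter(k_np[:, 0], k_np[:, 1], c=p.numpy().squeeze(), cmap="viridis")
    plt.colorbar(label="p")
    plt.show()
```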
Depthmap observations
I'm very excited about making progress here. The main plan is to use our geometric and kinematic latent codes to modulate an INR for depth-map rendering. Unlike the other INRs, this one will not directly fit modulations, since we have already learned the respective codes. As far as I know, this is a novel use of modulated Siren architectures.
Dataset
This dataset will again follow from the previous ones.
The query space here consists of image coordinates `x = [-1, 1]` and depth values `d = [0, 1]`. The output space is again a probability value.
A pre-rendered depth map can be loaded from disk, and a subset of pixel queries will be sampled from it. Queries can again be sampled around each object such that roughly half will have p = 1.0.
Since we plan to re-use the information in the geometry and kinematic latent codes for each object, a training trial will require more than just queries and outputs: `z_g` and `z_k` can be generated for the time point by inferring from the GT geometry and kinematic data. A sketch of the sampling follows.
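A minimal sketch of the query sampling for one depth map, under the assumption that maps are stored as (H, W) arrays normalized to [0, 1]; for brevity it samples pixels over the whole image rather than around each object, and `sample_depth_queries` is a hypothetical name.

```python
import numpy as np

def sample_depth_queries(depth: np.ndarray, n: int, eps: float = 0.05):
    """Sample (x, y, d) queries from a pre-rendered depth map.

    depth: (H, W) array with values in [0, 1].
    Roughly half the queries lie on the true surface (p = 1.0);
    the rest are perturbed off it in depth (p = 0.0).
    """
    h, w = depth.shape
    rows = np.random.randint(0, h, size=n)
    cols = np.random.randint(0, w, size=n)
    # Map pixel indices to image coordinates in [-1, 1].
    xy = np.stack([cols / (w - 1) * 2 - 1, rows / (h - 1) * 2 - 1], axis=1)
    d = depth[rows, cols]

    ys = np.ones(n)
    off = np.random.rand(n) < 0.5  # perturb ~half the queries off-surface
    d[off] = np.clip(d[off] + np.random.normal(0.0, eps, off.sum()), 0.0, 1.0)
    ys[off] = 0.0

    qs = np.concatenate([xy, d[:, None]], axis=1)  # (n, 3)
    return qs, ys
```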
Training
For a given trial, there will be no fitting procedure in the inner loop. Instead, the fitted `z_g` and `z_k` will be concatenated and passed directly to the modulator to generate `phi`. The outer loop remains the same, updating on the gradients of the batch-average loss.
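A sketch of that forward pass; `modulator` and `siren` stand in for the repo's modulated-Siren components, and the concatenation order of the two codes is an arbitrary choice here.

```python
import torch

def depth_inr_forward(siren, modulator, qs, z_g, z_k):
    """Forward pass for the depth-map INR: no inner-loop fitting.

    qs: (n, 3) queries of image coordinates and depth, (x, y, d).
    z_g, z_k: already-fitted geometry and kinematics codes.
    """
    z = torch.cat([z_g, z_k], dim=-1)  # reuse the fitted codes directly
    phi = modulator(z)                 # modulations conditioning the Siren
    return siren(qs, phi)              # probability for each query

# The outer loop is unchanged: average per-trial losses over the batch
# and update the siren and modulator weights on that gradient.
```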
Visualization
We can just generate depth-map images by optimizing a depth-map matrix given fixed pixel coordinates and the codes `z_g`, `z_k`.
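A minimal sketch of that rendering step, reusing the hypothetical `depth_inr_forward` from the training sketch above: the depth matrix is the only parameter being optimized, with the same `-log(p)` objective as in the kinematics visualization.

```python
import torch

def render_depth_map(siren, modulator, z_g, z_k, hw=(64, 64), steps=200):
    """Optimize a depth matrix under the depth INR for fixed pixel coords."""
    h, w = hw
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    xy = torch.stack([xs.flatten(), ys.flatten()], dim=1)  # fixed pixel coords

    d = torch.full((h * w, 1), 0.5, requires_grad=True)  # depth matrix to fit
    opt = torch.optim.Adam([d], lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        qs = torch.cat([xy, d], dim=1)
        p = depth_inr_forward(siren, modulator, qs, z_g, z_k).clamp_min(1e-8)
        loss = -torch.log(p).mean()  # same -log(p) trick as kinematics
        loss.backward()
        opt.step()
    return d.detach().reshape(h, w)
```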