wpilibsuite / allwpilib

Official Repository of WPILibJ and WPILibC
https://wpilib.org/
Other
1.05k stars 612 forks source link

Could you add a isStale method to NetworkTableEntry #2630

Closed JimWright4089 closed 1 year ago

JimWright4089 commented 4 years ago

I like that NetworkTableEntry has .getLastChange(), however, it would be nice if it had a method called isStale. isStale is a method that returns a boolean to tell if the last changed time is older than a static user set timeout value. This gives a quick way to tell if my code should rely on the entry or not.

Describe the solution you'd like public boolean isStale(NetworkTableEntry entry) { if(HALUtil.getFPGATime()-entry.getLastChange()>mTimeout) { return true; } return false; }

With mTimout being a static attribute, with a static void setTimeout(long timeout) to set it.

Describe alternatives you've considered I would have just subclassed NetworkTableEntry to give me this functionality. However, NetworkTableEntry is defined as final, so I can't subclass it.

Additional context I'm working with a RPLIDAR A2 Lidar to add some object detection to Auto, to avoid running into your partner robots if they get in front of yours. I put the Lidar on a Jetson Nano, so that it can do the processing of the data (and eventually do SLAM as well). I send 35 points (-180 to 180 in 5 degree increments) to the Rio using the NetTables.

This Lidar seems to have a problem with edges, where the reading goes from 1 meter to 8 meters. The Lidar takes up to 2 seconds to get a quality reading. During this time I want the NetTable entry to go "stale" so that the code on the Rio knows it can't rely on it.

https://www.youtube.com/watch?v=pKU55_nYqT0 https://www.youtube.com/watch?v=v2nI9HwqaQw

calcmogul commented 4 years ago

The Lidar takes up to 2 seconds to get a quality reading.

Does this mean the data spans a 2 second time period, or that the data is instantaneous, but 2 seconds old?

If it's the latter, you can make an entry listener so you only process the data once. You should also send a timestamp so you can compensate for latency in the measurement. You can compensate for measurement latency by keeping a backlog of poses with timestamps, finding the transformation from the pose to the measurement at the matching timestamp, then applying that transformation to all poses after it. More complex methods with better filtering are available. Here's an example of the backlog logic, although it requires a model for the Kalman filter part, which might be overkill for you: https://github.com/mcm001/allwpilib/blob/state-space-v2/wpilibj/src/main/java/edu/wpi/first/wpilibj/estimator/KalmanFilterLatencyCompensator.java

I say all this because control loops operating on fast dynamics (FRC drivetrains) really don't like old measurements (100ms+ of latency). A "stale" flag addresses a symptom instead of fixing the problem.

prateekma commented 4 years ago

ntcore doesn't have access to the HAL, so your snippet wouldn't work

JimWright4089 commented 4 years ago

ntcore doesn't have access to the HAL, so your snippet wouldn't work

Sorry, I should have pointed out that the java was the functionality I wanted.

The full request would be: In the ntcore cpp storage class add a field for stale for the TableEntry, as well as the settor.

Then in ntcore_cpp.cpp add this function:

bool IsStale(NT_Entry entry) { uint64_t lastChanged = GetEntryLastChange(entry);

Handle handle{entry}; int id = handle.GetTypedIndex(Handle::kEntry); auto ii = InstanceImpl::Get(handle.GetInst()); if (id < 0 || !ii) return false;

uint64_t staleTime ii->storage.GetEntryStaleTime(id);

return (wpi::Now() - lastChanged) > staleTime; }

And then add all of the CPP/Java access code methods for this function.

JimWright4089 commented 4 years ago

Does this mean the data spans a 2 second time period, or that the data is instantaneous, but 2 seconds old?

isStale returns false until the 2 seconds after the last write to the entry. If the entry is written to every 200ms, the isStale will always be false.

Here is an example: If the stale time is set to 2 seconds.
Time 0000ms: Entry "ZeroDeg" is set to 1000.00. Time 0010ms: isStale is called on entry "ZeroDeg" it returns false. Time 2000ms: isStale is called on entry "ZeroDeg" it returns false. Time 2001ms: isStale is called on entry "ZeroDeg" it returns true. Time 2011ms: isStale is called on entry "ZeroDeg" it returns true. Time 2012ms: Entry "ZeroDeg" is set to 1030.00. Time 2022ms: isStale is called on entry "ZeroDeg" it returns false. Time 3000ms: Entry "ZeroDeg" is set to 900.00. Time 3001ms: isStale is called on entry "ZeroDeg" it returns false. Time 5000ms: isStale is called on entry "ZeroDeg" it returns false. Time 5001ms: isStale is called on entry "ZeroDeg" it returns true. Time 6000ms: isStale is called on entry "ZeroDeg" it returns true.

calcmogul commented 4 years ago

I'm still having trouble understanding the data format your LiDAR is giving you. In what format is the data delivered and when? Is it like a streaming protocol, or is the data delivered by the sensor atomically? I need more context about your system to provide clearer advice and make sure your suggested solution is the optimal one for your use case.

JimWright4089 commented 4 years ago

So the Lidar is connected to a Jetson Nano through a serial port. The Jetson Nano has the ntcore library compiled onto it. The Jeston takes in a stream of data, parses it and loads it into 360 buckets then, to save bandwidth, sends only 35 readings to the RoboRio through NetTables. So the publisher of the NetTableEntry is the Jetson Nano, the Server is the RoboRio, and the subscriber is a TimedRobot program, controlling the motors.

JimWright4089 commented 4 years ago

I need more context about your system to provide clearer advice and make sure your suggested solution is the optimal one for your use case.

I've got a wall following robot. I've got a Lidar subsystem that subscribes to all of the 35 readings. The rough wall following algorithm the 60 degrees NetTableEntry. It wants to use 60 degrees as the input into a PID controller that tries to keep the 60 degree reading 800mm away from the wall. The Lidar has a problem when it reaches the end of a wall. If everything was working as indented, the robot would notice the reading going from 800mm to 9 meters and make a fairly hard right turn.

However, the Lidar, at the end of the wall, has trouble with the big jump and needs some time adjusting about 1 second worth of time before it gives a good 60 degree reading. So the robot still thinks there is a wall there. If left on it's own, by the time the Lidar catches up, the hard right comes too late and the robot looses the wall.

So, in my subsystem I have the function to look at getLastChanged, subtract it from the current time, and compare it to 500ms. If it's true I know the lidar is in this odd catchup state, so I slow the robot down to a crawl until the 60 degree NetTable entry is "not stale".

I saw your response on the Kalman Filter and that is one of the next things to add. However, I was working on a large power controller for an Airplane and we have the stale flag on our core database entries and it was extremely useful, so I thought, it would be a nice feature to add to the ntcore.

JimWright4089 commented 4 years ago

I need more context about your system to provide clearer advice and make sure your suggested solution is the optimal one for your use case.

One of the test use cases we had on the Learjet, was, before you engaged the thrust reverser, all of the data from the engine had to read good, and not stale (our frame was 20ms, and our timeout was 21ms), so that you did not cause the pilot to have a bad day.

ewpratten commented 4 years ago

I overcame this problem by implementing the following system that you may be interested in:

Have the Jetson constantly publish a reading from it's system clock, and the timestamp of the most recent packet of data from the lidar system. On the Rio side, check if the packet timestamp falls too far apart from the Jetsons system clock.

Make sure not to rely on the Rios clock for this, because on-robot NTP can be slow (I have had a robot go entire matches without fetching the correct time from the driverstation )

Daltz333 commented 3 years ago

@PeterJohnson Is this still relevant in NTv4?

rzblue commented 1 year ago

Digging up an old issue, but this is probably not relevant with NT4 publishers in the default transitory mode.

PeterJohnson commented 1 year ago

There’s no specific stale function for NT4, however NT4 does introduce the readQueue function which only gives you new values, and empty otherwise. Also timestamps can be used to do this yourself. With multiple alternative solutions available, I’m closing this.