StarlingUAS / ProjectStarling

BRL Flight Arena Infrastructure 2.0
Other
16 stars 3 forks source link

Add Pi SysID configuration #24

Closed rob-clarke closed 3 years ago

rob-clarke commented 3 years ago

Add a way to set SysID from a file in the Pi into the mavros entrypoint

rob-clarke commented 3 years ago

Proposed functionality:

If MAVROS_TGT_SYSTEM is nonzero, use that as the target system ID. If MAVROS_TGT_SYSTEM is zero, attempt to lookup from file. If file does not exist, fallback to IP-based generation

@mhl787156 Would that work in place of the existing setup where MAVROS_TGT_SYSTEM is set to an IP? Can a container get it's own IP under Kubernetes?

mhl787156 commented 3 years ago

I think that's a good idea, although there should definitely be an explicit check somewhere that the read MAVROS_TGT_SYSTEM matches the SYSID just in case.

A pod can get its own ip address using ifconfig/ ip a and some regex. Although I find it odd how ifconfig is in the image, but ip a is not. Just occurred to me that this would only work on networks with subnet masks <16 as well, otherwise we'll get collisions. Should be fine on standard 192.168.0.1/16, but must check if k8 network is /16 or /12, I have a feeling it might be the latter...

There is also chance of ID collision if both SITL and real drones are used which we need to be careful of. Partitioning ID space in one solution, but then IP generation may not work...

Unfortunately I can't think of another method apart from IP generation which would guarantee less collisions (apart from an explicit ID server...) so I think what you have suggested is still the best solution, but we need to make a note of the caveats.

mhl787156 commented 3 years ago

@rob-clarke Agreement on file name and location as /etc/drone.config?

What format are we using? Bash? or something else?

https://unix.stackexchange.com/questions/16521/language-agnostic-configuration-file-format

mhl787156 commented 3 years ago

There is also a question for the px4-sitl container. Should it also have a similar procedure, i.e: if PX4_INSTANCE is non zero, use that as px4 sysid if PX4_INSTANCE is zero, fallback to IP-based generation (onboard run on entrypoint)

rob-clarke commented 3 years ago

Ideally the config file is something that can be parsed at the command line and put straight into an envvar. Using yq could be a nice way of using YAML but it does add another dependency. Just sourcing bash will minimise dependencies but won't be as pretty. Otherwise no real preferences.

A similar approach to SITL would be good. I'm not sure how I feel about IP-based fallback, it may be better to just explicitly require an ID. That could be part of a script that generates Kubernetes/compose configs. The main problem is that the way the SITL allocates ports starts breaking once the instance number goes above 9. This is due to how the PX4 SITL is written. If we can restrict the IP pool to 1-10 it would be ideal but that will clash with the Vicon machine on the arena network. Port remappings could be added on top at a higher layer if needed.

mhl787156 commented 3 years ago

Alright we can implement it as a bash file for now, makes things simpler

I think whatever ends up happening with MAVROS needs to be replicated so we know they are behaving identically. I feel in order to use the scaling that kubernetes gives us, having an auto created ID - through an IP address or otherwise (e.g. a core sysid_allocation server or something)

Why does the SITL port start breaking > 9? And why does that effect Vicon?

rob-clarke commented 3 years ago

First thing of note, PX4_INSTANCE is 0-indexed. As a MAVLink system ID of 0 is invalid, the SITL instance's MAVLink system ID is set to PX4_INSTANCE + 1. This needs to match the ID assigned to the associated Mavros container's MAVROS_TGT_SYSTEM. This means a SITL with PX4_INSTANCE=0 is paired with a Mavros container with MAVROS_TGT_SYSTEM=1.

In the PX4 startup script there is only enough of a gap between the standard ports for 9 instances to fit before udp_offboard_port_remote starts overlapping. It may not be too much of a problem with each SITL in its own container, as there won't be port collisions due to separate IPs. It's just something that will need to be dealt with carefully on the Mavros end, i.e. FCU_URLs beyond instance 9 will have the same port, but different IPs. We could also modify the startup to shift things onto say 14640 which should sidestep the problem.

IIRC, the Vicon machine must be 192.168.10.1 and will allocate IPs to the cameras in the 192.168.10.[2-20] (or more) range. So if instance number is derived purely from the final octet (as currently being done for the Mavros container), either IPs collide with Vicon stuff, or the SITL ports overlap. If we want to follow the proposal of reserving 1-10 and potentially some upper ranges for SITL system IDs, there will need to be a mapping between the IP octets and ranges of reserved system IDs.

Using Kubernetes StatefulSet, it is possible to get a unique integer ordinal for each replica. This would allow each SITL instance to have an arbitrary IP and have a ordinal to use for mapping onto the reserved system ID ranges. If that works nicely, I'd be tempted to remove the IP option all together so that the containers either need an explicit ID set, or require the presence of a file-based definition and will fail otherwise.

mhl787156 commented 3 years ago

Ah right that makes sense, thats an annoying subtle difference. Will need to adjust the current implementation appropriately.

Agreed, itt would be easiest to just fix the port to 14640 as default, since Mavros only ever talks over localhost to one SITL/Autopilot anyway.

Could we put the Vicon on its own network/vlan with a bridge onto the normal network? Yeah that tracks with the other IP collision concerns I had too. If we end up having to keep track of IP to SYSID mappings, we may as well do a DHCP like SYSID allocation to make everything more flexible/ cleaner.

Yes StatefulSets are promising, the issue there is combined virtual/ real flight groups. since they would be two different types of SSet deployment and would have overlapping ordinals. If we can formalise (or even somehow make flexible) the allocation ranges, we can just sitl SITL pods to generate as $(SITL_BASE_SYSID)+$(ordinal).

The other thing to formalise is in which files/containers we want to place these SYSID/port adjustments. Both base-px4 and px4-sitl containers need this auto adjustment. Ideally Starling-mavros itself shouldn't need to know if its running SITL or real.

rob-clarke commented 3 years ago

I've had a brief look at the PR, using a constant port may result in problems if people try to run things --network=host with multiple local SITL/Mavros instances. My thinking was to follow the current system of incrementing the port but to start from a higher port number that left plenty of space for additional instances. Using udp_offboard_port_remote=$((14830+px4_instance)) would deconflict the ports for instance numbers up to 250.

Vicon on its own network was a suggestion I made when setting up the new PC. I can't remember why it was rejected beyond complexity and potentially breaking existing configurations. An easier option could be to just widen the netmask that DHCP gives out and allocate from the 192.168.11.x pool.

Real vehicles should not need to use a StatefulSet as they get the ID from the config file, so colliding ordinals should not be a problem. If system IDs are assigned as ordinal + 200 (i.e. SITL_BASE_SYSID=200), that would leave room for 40 virtual vehicles before hitting the proposed reserved range for GCS systems.

mhl787156 commented 3 years ago
rob-clarke commented 3 years ago

Implementation in the droneconfig branch now uses /etc/starling/vehicle.config. See example here