freifunk-gluon / gluon

a modular framework for creating OpenWrt-based firmwares for wireless mesh nodes
https://gluon.readthedocs.io
Other
553 stars 325 forks source link

status-page does not show neighbours with BATMAN_V #1726

Open krombel opened 5 years ago

krombel commented 5 years ago

Bug report

What is the expected behaviour? Mesh partners are shown.

What is the problem? Mesh-Partners are not shown - neither for mesh-vpn nor other.

Gluon Version: v2018.2.1

Site Configuration: Site: ffmuc Domain: ffmuc_welt Link: https://github.com/freifunkMUC/site-ffm/blob/0c84212/domains/ffmuc_welt.conf

Custom patches: None

Some hints which we (@mweinelt and I) to while debugging this:

ecsv commented 5 years ago

@ambassador86 your change 582d09615bdcd9d9b6cb5ee74173ab78af3d846d is missing some follow up changes to support the throughput metric (BATADV_ATTR_THROUGHPUT) in:

Right now, all these things depend on BATADV_ATTR_TQ (which is not available with BATMAN_V). Maybe this was planned by you but I just wanted to mention it here since the author of this issue report assumes that these things should be supported.

Maybe you or @txt-file have the patches to get the support for BATADV_ATTR_THROUGHPUT somewhere. At least Zwickau told me in the past that they were using BATMAN_V with your Chemnitz-Umland firmware and it also seems to me that I can see their neighbor information on your map https://map.chemnitz.freifunk.net/#!v:m;n:6466b3bec204

ecsv commented 5 years ago

Basic steps (just take this as a rough idea - not the complete solution) which someone has to do are following:

diff --git a/package/gluon-mesh-batman-adv/src/respondd.c b/package/gluon-mesh-batman-adv/src/respondd.c
index 810dfc59..b810eb71 100644
--- a/package/gluon-mesh-batman-adv/src/respondd.c
+++ b/package/gluon-mesh-batman-adv/src/respondd.c
@@ -700,6 +700,75 @@ static int parse_orig_list_netlink_cb(struct nl_msg *msg, void *arg)
    return NL_OK;
 }

+static const enum batadv_nl_attrs parse_neigh_list_mandatory[] = {
+   BATADV_ATTR_NEIGH_ADDRESS,
+   BATADV_ATTR_THROUGHPUT,
+   BATADV_ATTR_HARD_IFINDEX,
+   BATADV_ATTR_LAST_SEEN_MSECS,
+};
+
+static int parse_neigh_list_netlink_cb(struct nl_msg *msg, void *arg)
+{
+   struct nlattr *attrs[BATADV_ATTR_MAX+1];
+   struct nlmsghdr *nlh = nlmsg_hdr(msg);
+   struct batadv_nlquery_opts *query_opts = arg;
+   struct genlmsghdr *ghdr;
+   uint32_t throughput;
+   uint8_t *neigh;
+   uint32_t hardif;
+   uint32_t lastseen;
+   char ifname_buf[IF_NAMESIZE], *ifname;
+   struct neigh_netlink_opts *opts;
+   char mac1[18];
+
+   opts = batadv_container_of(query_opts, struct neigh_netlink_opts,
+                  query_opts);
+
+   if (!genlmsg_valid_hdr(nlh, 0))
+       return NL_OK;
+
+   ghdr = nlmsg_data(nlh);
+
+   if (ghdr->cmd != BATADV_CMD_GET_NEIGHBORS)
+       return NL_OK;
+
+   if (nla_parse(attrs, BATADV_ATTR_MAX, genlmsg_attrdata(ghdr, 0),
+             genlmsg_len(ghdr), batadv_genl_policy))
+       return NL_OK;
+
+   if (batadv_genl_missing_attrs(attrs, parse_neigh_list_mandatory,
+                     BATADV_ARRAY_SIZE(parse_neigh_list_mandatory)))
+       return NL_OK;
+
+   neigh = nla_data(attrs[BATADV_ATTR_NEIGH_ADDRESS]);
+   throughput = nla_get_u32(attrs[BATADV_ATTR_TQ]);
+   hardif = nla_get_u32(attrs[BATADV_ATTR_HARD_IFINDEX]);
+   lastseen = nla_get_u32(attrs[BATADV_ATTR_LAST_SEEN_MSECS]);
+
+   ifname = if_indextoname(hardif, ifname_buf);
+   if (!ifname)
+       return NL_OK;
+
+   sprintf(mac1, "%02x:%02x:%02x:%02x:%02x:%02x",
+       neigh[0], neigh[1], neigh[2], neigh[3], neigh[4], neigh[5]);
+
+   struct json_object *obj = json_object_new_object();
+   if (!obj)
+       return NL_OK;
+
+   struct json_object *interface;
+   if (!json_object_object_get_ex(opts->interfaces, ifname, &interface)) {
+       interface = json_object_new_object();
+       json_object_object_add(opts->interfaces, ifname, interface);
+   }
+
+   json_object_object_add(obj, "throughput", json_object_new_int(throughput));
+   json_object_object_add(obj, "lastseen", json_object_new_double(lastseen / 1000.));
+   json_object_object_add(interface, mac1, obj);
+
+   return NL_OK;
+}
+
 static struct json_object * get_batadv(void) {
    struct neigh_netlink_opts opts = {
        .query_opts = {
@@ -712,9 +781,20 @@ static struct json_object * get_batadv(void) {
    if (!opts.interfaces)
        return NULL;

-   ret = batadv_genl_query("bat0", BATADV_CMD_GET_ORIGINATORS,
-               parse_orig_list_netlink_cb, NLM_F_DUMP,
-               &opts.query_opts);
+   /* TODO retrieve BATADV_ATTR_ALGO_NAME via BATADV_CMD_GET_MESH_INFO
+    * maybe just add a retrival function to libbatadv
+    */
+   if (strcmp(algo_name, "BATMAN_IV") == 0) {
+       ret = batadv_genl_query("bat0", BATADV_CMD_GET_ORIGINATORS,
+                   parse_orig_list_netlink_cb, NLM_F_DUMP,
+                   &opts.query_opts);
+   } else if (strcmp(algo_name, "BATMAN_V") == 0) {
+       ret = batadv_genl_query("bat0", BATADV_CMD_GET_NEIGHBORS,
+                   parse_neigh_list_netlink_cb, NLM_F_DUMP,
+                   &opts.query_opts);
+   } else {
+       ret = -1;
+   }
    if (ret < 0) {
        json_object_put(opts.interfaces);
        return NULL;
diff --git a/package/gluon-status-page-mesh-batman-adv/src/neighbours-batadv.c b/package/gluon-status-page-mesh-batman-adv/src/neighbours-batadv.c
index 9c2be2f7..addf2e38 100644
--- a/package/gluon-status-page-mesh-batman-adv/src/neighbours-batadv.c
+++ b/package/gluon-status-page-mesh-batman-adv/src/neighbours-batadv.c
@@ -82,6 +82,66 @@ static int parse_orig_list_netlink_cb(struct nl_msg *msg, void *arg)
   return NL_OK;
 }

+static const enum batadv_nl_attrs parse_neigh_list_mandatory[] = {
+  BATADV_ATTR_NEIGH_ADDRESS,
+  BATADV_ATTR_THROUGHPUT,
+  BATADV_ATTR_HARD_IFINDEX,
+};
+
+static int parse_neigh_list_netlink_cb(struct nl_msg *msg, void *arg)
+{
+  struct nlattr *attrs[BATADV_ATTR_MAX+1];
+  struct nlmsghdr *nlh = nlmsg_hdr(msg);
+  struct batadv_nlquery_opts *query_opts = arg;
+  struct genlmsghdr *ghdr;
+  uint32_t throughput;
+  uint8_t *neigh;
+  uint32_t hardif;
+  char ifname_buf[IF_NAMESIZE], *ifname;
+  struct neigh_netlink_opts *opts;
+  char mac1[18];
+
+  opts = batadv_container_of(query_opts, struct neigh_netlink_opts, query_opts);
+
+  if (!genlmsg_valid_hdr(nlh, 0))
+    return NL_OK;
+
+  ghdr = nlmsg_data(nlh);
+
+  if (ghdr->cmd != BATADV_CMD_GET_NEIGHBORS)
+    return NL_OK;
+
+  if (nla_parse(attrs, BATADV_ATTR_MAX, genlmsg_attrdata(ghdr, 0),
+        genlmsg_len(ghdr), batadv_genl_policy))
+    return NL_OK;
+
+  if (batadv_genl_missing_attrs(attrs, parse_neigh_list_mandatory,
+                           BATADV_ARRAY_SIZE(parse_neigh_list_mandatory)))
+    return NL_OK;
+
+  neigh = nla_data(attrs[BATADV_ATTR_NEIGH_ADDRESS]);
+  throughput = nla_get_u32(attrs[BATADV_ATTR_THROUGHPUT]);
+  hardif = nla_get_u32(attrs[BATADV_ATTR_HARD_IFINDEX]);
+
+  ifname = if_indextoname(hardif, ifname_buf);
+  if (!ifname)
+    return NL_OK;
+
+  sprintf(mac1, "%02x:%02x:%02x:%02x:%02x:%02x",
+          neigh[0], neigh[1], neigh[2], neigh[3], neigh[4], neigh[5]);
+
+  struct json_object *neigh = json_object_new_object();
+  if (!neigh)
+    return NL_OK;
+
+  json_object_object_add(neigh, "throughput", json_object_new_int(throughput));
+  json_object_object_add(neigh, "ifname", json_object_new_string(ifname));
+
+  json_object_object_add(opts->obj, mac1, neigh);
+
+  return NL_OK;
+}
+
 static json_object *neighbours(void) {
   struct neigh_netlink_opts opts = {
     .query_opts = {
@@ -94,9 +154,20 @@ static json_object *neighbours(void) {
   if (!opts.obj)
     return NULL;

-  ret = batadv_genl_query("bat0", BATADV_CMD_GET_ORIGINATORS,
-                          parse_orig_list_netlink_cb, NLM_F_DUMP,
-                          &opts.query_opts);
+  /* TODO retrieve BATADV_ATTR_ALGO_NAME via BATADV_CMD_GET_MESH_INFO
+   * maybe just add a retrival function to libbatadv
+   */
+  if (strcmp(algo_name, "BATMAN_IV") == 0) {
+    ret = batadv_genl_query("bat0", BATADV_CMD_GET_ORIGINATORS,
+                            parse_orig_list_netlink_cb, NLM_F_DUMP,
+                            &opts.query_opts);
+  } else if (strcmp(algo_name, "BATMAN_V") == 0) {
+    ret = batadv_genl_query("bat0", BATADV_CMD_GET_NEIGHBORS,
+                            parse_neigh_list_netlink_cb, NLM_F_DUMP,
+                            &opts.query_opts);
+  } else {
+    ret = -1;
+  }
   if (ret < 0) {
     json_object_put(opts.obj);
     return NULL;

The changes in gluon-radv-filterd.c are left as an exercise to the reader. You basically have to change all occurrences of tq/max_tq in the datastructures to also store larger values than uint16_t (and name the variables accordingly), find a good hysteresis function/value for the throughput metric and adjust parse_originator to support tq and throughput:

diff --git a/package/gluon-radv-filterd/src/gluon-radv-filterd.c b/package/gluon-radv-filterd/src/gluon-radv-filterd.c
index ed090c46..0e4b1fc4 100644
--- a/package/gluon-radv-filterd/src/gluon-radv-filterd.c
+++ b/package/gluon-radv-filterd/src/gluon-radv-filterd.c
@@ -462,13 +462,13 @@ static int parse_originator(struct nl_msg *msg,

    static const enum batadv_nl_attrs mandatory[] = {
        BATADV_ATTR_ORIG_ADDRESS,
-       BATADV_ATTR_TQ,
    };
    struct nlattr *attrs[BATADV_ATTR_MAX + 1];
    struct nlmsghdr *nlh = nlmsg_hdr(msg);
    struct ether_addr mac_a;
    struct genlmsghdr *ghdr;
    struct router *router;
+   uint32_t throughput;
    uint8_t *orig;
    uint8_t tq;

@@ -489,8 +489,11 @@ static int parse_originator(struct nl_msg *msg,
    if (batadv_genl_missing_attrs(attrs, mandatory, ARRAY_SIZE(mandatory)))
        return NL_OK;

+   if (!attrs[BATADV_ATTR_TQ] &&
+       !attrs[BATADV_ATTR_THROUGHPUT])
+       return NL_OK;
+
    orig = nla_data(attrs[BATADV_ATTR_ORIG_ADDRESS]);
-   tq = nla_get_u8(attrs[BATADV_ATTR_TQ]);

    if (!attrs[BATADV_ATTR_FLAG_BEST])
        return NL_OK;
@@ -502,11 +505,24 @@ static int parse_originator(struct nl_msg *msg,
    if (!router)
        return NL_OK;

-   DEBUG_MSG("Found TQ for router " F_MAC " (originator " F_MAC "), it's %d",
-         F_MAC_VAR(router->src), F_MAC_VAR(router->originator), tq);
-   router->tq = tq;
-   if (router->tq > G.max_tq)
-       G.max_tq = router->tq;
+   /* TODO make sure that router actually uses tq/throughput metric
+    * and make sure that we don't mix them - can currently not happen
+    * with batman-adv
+    */
+   if (attrs[BATADV_ATTR_TQ]) {
+       tq = nla_get_u8(attrs[BATADV_ATTR_TQ]);
+       DEBUG_MSG("Found TQ for router " F_MAC " (originator " F_MAC "), it's %d",
+           F_MAC_VAR(router->src), F_MAC_VAR(router->originator), tq);
+       router->metric = tq;
+   } else if (attrs[BATADV_ATTR_THROUGHPUT]) {
+       throughput = nla_get_u32(attrs[BATADV_ATTR_THROUGHPUT]);
+       DEBUG_MSG("Found throughput for router " F_MAC " (originator " F_MAC "), it's %d",
+           F_MAC_VAR(router->src), F_MAC_VAR(router->originator), throughput);
+       router->metric = throughput;
+   }
+
+   if (router->metric > G.max_metric)
+       G.max_metric = router->metric;

    return NL_OK;
 }

There are more changes required to get support for the throughput value in various gather daemons and frontends (status page, maps, ...)

ambassador86 commented 5 years ago

Hello @krombel @ecsv , i know that the status page support is missing. I am glad that now other communities started playing around with BATMAN_V. Our team in Zwickau hacked the status page support a little bit to get some lines on the map. Unfortunately they implemented it in a way that only supports BATMAN_V, The changes can be found in our site repo: https://gitlab.com/FreifunkChemnitz/site-ffc/tree/zwickau-devel/patches

krombel commented 5 years ago

Are those patches really recommended by you? I e.g. saw this which makes me assume that it is not fully implemented on your side either...

ambassador86 commented 5 years ago

Hey @krombel, these patches are good enough if you only have one site and all domains share the same routing algorithm. If you have mixed routing algorithms in your domains the patches will break the status page for BATMAN_IV domains.

awlx commented 5 years ago

Our experimental firmware (http://firmware.ffmuc.net/experimental/) now applies all your patches. It's looking good so far. And as most people in our chat (https://chat.ffmuc.net) confirmed they are fine to break the old setup in favour of the new ... as we want to migrate everything to BATMAN_V anyway ;).

rotanid commented 5 years ago

if anyone provides a pull request that doesn't break BATMAN_IV we are happy to take it. otherwise this won't be fixed before 2019.1 release sadly

christf commented 5 years ago

we could scrap the idea of querying the mesh protocols for their neighbours but query using multicast. This would be mesh protocol agnostic and fairly easy to do. It would also lighten the maintenance burden on the respondd modules.

However it would not mean the same thing as today. Now mesh neighbours are neighbours who are known to the mesh protocol. When implementing the above, neighbours will be known to the link layer. So there is a tradeoff.