SignalK / specification

Signal K is a JSON-based format for storing and sharing marine data from different sources (e.g. NMEA 0183, NMEA 2000, SeaTalk, etc.)

Issue with Notes #539

Open irandrews opened 5 years ago

irandrews commented 5 years ago

We have a cloud database of around 15,000 'notes' (growing by 3,500 a year). Our 6,000 members share these notes in real time using an existing App and Website. We would like to (actually have done!) make these notes available via SignalK so that the range of clients can be expanded.

We can get the system to work (see the demo) but the current notes schema has a couple of limitations for us, and, I suspect, others. Specifically:

Location. Typically our notes are about Harbours and Anchorages. The first person to make a note about somewhere picks an arbitrary GPS point somewhere in the harbour and creates a note. Other users can add their own views to this location and these are all displayed on the screen at the same time. The current schema has no way to relate notes to each other - and specifically for the client to tell the server whether the note is about an existing location, or a new one. Unfortunately the 'name' field (which we use as the port name) is not unique, and the 'position' field only has to have a minor resolution difference for it to become a new location. Currently we are overloading the uuid field to provide this info, but this is against the schema. One solution might be for each note to have a pointer (like it does for 'region') to a named 'location' - aka a Waypoint - but we don't want to overload Waypoints as these can be used for different purposes and appear differently (i.e. in a different colour) on a screen. One possibility would be to create a new class of entity: 'location' (which has a GPS reference) or another would be to add a 'type' tag to the Waypoint entity so the client could determine what it relates to. I wouldn't want to remove 'position' from the note though.

Privileges. In our world users can edit/delete their own notes, but not one created by someone else. The current schema has no way for the server to tell the client what is editable and what is not, so that the client can put the relevant buttons on the screen. I favour something like a 'readonly' tag added to the note schema.

rob42 commented 5 years ago

Re Location: You can use one of 3 options to 'locate' a note:

           "region": {
              "type": "string",
              "description": "Region related to note. A pointer to a region UUID. Alternative to position or geohash"
            },
            "position": {
              "description": "Position related to note. Alternative to region or geohash",
              "$ref": "../definitions.json#/definitions/position"
            },
            "geohash": {
              "description": "Position related to note. Alternative to region or position",
              "$ref": "../definitions.json#/definitions/geohash"
            },

A position is a specific lat/lon and, as you say, change a few decimals and it's a different entry. This is where a geohash is useful. See https://en.wikipedia.org/wiki/Geohash. Basically, a geohash is a single string that identifies a rectangle: the more characters, the smaller the area. So a search on fewer characters is like a zoom-out. Another option is region, which is a GeoJSON polygon and hence can outline any shape. By tagging notes with a region you can show all notes in the polygon. The downside is that you have to define a region, which can be laborious depending on the detail.

Both options could be auto-populated by your backend, based on a position's geohash or on a lat/lon falling within a region.
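The region half of that auto-population could be sketched roughly as follows. This is a minimal illustration, not code from any Signal K server: `point_in_polygon`, `assign_region`, and the simplified `regions` mapping (region UUID to a list of `[longitude, latitude]` vertices) are assumptions made for the example, as is the sample data.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given by ring?

    ring is a list of [longitude, latitude] vertices (GeoJSON order).
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_region(note, regions):
    """Return a copy of note with 'region' set to the first region containing it."""
    pos = note["position"]
    for region_id, ring in regions.items():
        if point_in_polygon(pos["longitude"], pos["latitude"], ring):
            return {**note, "region": region_id}
    return dict(note)


# Hypothetical sample data: a rough box and one note inside it.
regions = {"region-1": [[24.70, 59.40], [24.80, 59.40], [24.80, 59.50], [24.70, 59.50]]}
note = {"title": "Marina", "position": {"latitude": 59.4672, "longitude": 24.7313}}
tagged = assign_region(note, regions)
```

A note that falls outside every region is returned unchanged, so the backend can still fall back to its position or geohash.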

irandrews commented 5 years ago

OK, but can a geohash uniquely define a point (not a polygon)? If it can, this may do. What we need is to reliably (and indisputably) tie together all notes that are related (in our case they are about the same generic location, but that may not be the thing that relates them). It's the same scenario as this issue: are we adding a new comment to this issue, or creating a new one?

rob42 commented 5 years ago

Re Privileges: The artemis-server has security down to the individual key level; in your use case that's per note. But it's geared around users in groups, and read/write privileges per group (with RBAC). That means you would create a group per user (like Linux does), and the user would have to identify themselves when connecting. That would control the ability to read/write notes on a user, vessel, or group (club?) basis.
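To make the group-per-user idea concrete, here is a small sketch. It is not artemis-server's actual implementation or API: the `acl` layout, the `memberships` mapping, and the function names are all hypothetical, chosen only to show how a private group per user combines with shared groups.

```python
def user_groups(user, memberships):
    """Every user is in a private group named after them, plus any shared groups."""
    return {user} | memberships.get(user, set())


def allowed(user, key, action, acl, memberships):
    """Check whether user may perform action ('read' or 'write') on key.

    acl maps a key to {"read": set_of_groups, "write": set_of_groups}.
    """
    granted = acl.get(key, {}).get(action, set())
    return bool(user_groups(user, memberships) & granted)


# Hypothetical data: a note readable by the whole club, writable only by its author.
memberships = {"ivan": {"club"}, "teppo": {"club"}}
acl = {"resources.notes.n1": {"read": {"club"}, "write": {"ivan"}}}
```

With this layout, "edit your own notes only" is just a write grant to the author's private group.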

However, that status is not exposed in a normal Signal K message. I did some work early on to expose it as part of the key's metadata, but it lacked a real use case so it languished. Maybe it's time to resurrect it.

irandrews commented 5 years ago

Our server plugin already knows whether the user can edit a note or not. All it needs is a way to tell the client. I don't really want the client having to authenticate every note with a separate server. In fact ideally I don't want the client even knowing about a user - it doesn't need to.
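One way to picture this: the plugin decides editability server-side and ships only a flag to the client. The sketch below assumes a hypothetical `readonly` field on each note (the tag suggested earlier in this thread, not part of the current schema) and a hypothetical internal `ownerId` field that never leaves the server.

```python
def notes_for_client(notes, current_user_id):
    """Return copies of the notes with owner details stripped and a 'readonly' flag added."""
    result = {}
    for note_id, note in notes.items():
        # 'ownerId' is internal to the plugin; the client never sees it.
        public = {k: v for k, v in note.items() if k != "ownerId"}
        public["readonly"] = note.get("ownerId") != current_user_id
        result[note_id] = public
    return result


# Hypothetical data held by the plugin.
stored = {
    "n1": {"title": "Good holding", "ownerId": "D7160"},
    "n2": {"title": "Fuel berth closed", "ownerId": "D134"},
}
served = notes_for_client(stored, "D7160")
```

The client then only needs to check `readonly` to decide whether to draw edit and delete buttons; it never has to know who the user is.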

rob42 commented 5 years ago

A geohash is roughly like a tile in a Google map. It's a rectangular shape, and its name is a hash calculated from the lat/lon. The cool thing is that a long hash like u4pruydqqvj is a very small rectangle contained in a bigger rectangle u4pruydqq, which is in u4pruy, etc. A longer hash is more precise, so searching for the geohash prefix u4pruy will select all geohashes inside that rectangle.
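The prefix behaviour described above can be seen in a short sketch of the standard geohash encoding (interleaved longitude/latitude bisection, five bits per base-32 character). The function names and the sample notes are illustrative assumptions; only the `geohash` field comes from the schema quoted earlier.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=11):
    """Encode a lat/lon pair as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def notes_in_cell(notes, prefix):
    """Return the notes whose geohash lies inside the rectangle named by prefix."""
    return {k: n for k, n in notes.items() if n.get("geohash", "").startswith(prefix)}
```

Because each extra character only subdivides the previous rectangle, a shorter hash of the same point is always a prefix of the longer one, which is what makes the "zoom-out" search a plain string prefix match.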

rob42 commented 5 years ago

If you can identify the current user, then using meta you could send the current user's privileges to the client; you don't need to include the note's owner etc. But as I said, this will need some further investigation. See _attr about halfway down https://github.com/SignalK/specification/wiki/SignalK-Data-Model

tkurki commented 5 years ago

Re: privileges: in essence you are authenticating your local SK server as your CA account, which provides you with data, some of which is editable (your own) and some read-only; and you want to expose this distinction to the client through the SK API, right?

irandrews commented 5 years ago

Correct

tkurki commented 5 years ago

I think some real world examples included in this issue would make things a lot clearer.

tkurki commented 5 years ago

The current schema has no way to relate notes to each other

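If your original data is bound to GPS positions this can be represented with the current schema in two ways: (1)

  • creating a region for each GPS position, with the position encoded as geohash or a GeoJSON polygon representing a single point; and a uuid
  • the notes related to this particular position refer to this region

(2)

  • using the same position in all the notes

But that is not very useful yet. Just having a single HTTP path that has all the server's notes under it is only of limited use. We have just a data model - what we are lacking is an API that would allow you to

  • given a center position + radius or a bounding box list all positions that the server has notes for
  • retrieve the notes for a position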

Imagine the server having data for all relevant harbours in the world - an api that allows you to retrieve all the notes or a single one by id is not very useful.

You would create new notes by POSTing to ..../notes/ and receive the uuid of the created note. Tracking what is a "new" note (not yet written to your cloud server, I assume) is mainly the responsibility of your code, unless you want to show the sync status of each "new" note in the UI. I would solve this use case by adding a way to attach generic metadata (key-value pairs) to each note, so you can extend the data model to the needs of your particular use case.
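As a rough sketch of the create flow described here, the following in-memory store hands back a fresh uuid for each new note and keeps optional key-value metadata alongside it. `NoteStore`, its `create` method, the `properties` field, and the `CA:synced` key are all hypothetical names for illustration; no HTTP handling is shown.

```python
import uuid


class NoteStore:
    """Minimal in-memory stand-in for what a POST handler might do."""

    def __init__(self):
        self.notes = {}

    def create(self, note, properties=None):
        """Store a copy of the note and return the uuid assigned to it."""
        note_id = str(uuid.uuid4())
        stored = dict(note)
        if properties:
            # Generic key-value metadata, e.g. a plugin-specific sync flag.
            stored["properties"] = dict(properties)
        self.notes[note_id] = stored
        return note_id


store = NoteStore()
new_id = store.create(
    {"title": "Visitor berths full in July"},
    properties={"CA:synced": False},  # hypothetical plugin-specific key
)
```

The caller keeps `new_id` to refer to the note later, while the sync state lives in the metadata rather than in the core note fields.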

irandrews commented 5 years ago

I don't have a problem with getting the notes. (The fetch stuff is a protocol not a schema artefact). I already do a bunch of what you describe above. Attached are a screenshot and the associated server output for the notes for a given position. Note: At present I'm not using the position to relate them, I'm using the (overloaded) uuid. To go back to the original issue: how is the server supposed to tell that the new note is 'related' to existing notes, and not an entirely new one? I'm not looking to introduce use case specific stuff into the schema. These concepts will be of use in different use cases. note 21817.txt

panaaj commented 5 years ago

The current schema has no way to relate notes to each other

If your original data is bound to GPS positions this can be represented with the current schema in two ways: (1)

  • creating a region for each GPS position, with the position encoded as geohash or a GeoJSON polygon representing a single point; and a uuid
  • the notes related to this particular position refer to this region

(2)

  • using the same position in all the notes

But that is not very useful yet. Just having a single http path that has all the server's notes under it is only of limited use. We have just a data model - what we are lacking is an API that would allow you to

  • given a center position + radius or a bounding box list all positions that the server has notes for
  • retrieve the notes for a position

Imagine the server having data for all relevant harbours in the world - an api that allows you to retrieve all the notes or a single one by id is not very useful.
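The two operations listed above (find positions near a point, then fetch the notes for one position) could look something like this. It is a sketch under assumptions: the function names are invented, the great-circle distance uses the haversine formula with a mean Earth radius, and notes are assumed to carry a `position` object with `latitude` and `longitude` keys.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough for a sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def positions_within(notes, lat, lon, radius_m):
    """List the distinct (lat, lon) positions with notes inside the radius."""
    found = set()
    for note in notes.values():
        pos = note.get("position")
        if pos and haversine_m(lat, lon, pos["latitude"], pos["longitude"]) <= radius_m:
            found.add((pos["latitude"], pos["longitude"]))
    return sorted(found)


def notes_at(notes, lat, lon):
    """Return all notes recorded at exactly this position."""
    return {
        k: n for k, n in notes.items()
        if n.get("position", {}).get("latitude") == lat
        and n.get("position", {}).get("longitude") == lon
    }
```

A client would call the first function as the chart moves and the second only when the user opens a particular location, so it never has to download the whole collection.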

You would create new notes by POSTing to ..../notes/ and receive the uuid of the created note. Tracking what is a "new" note (not yet written to your cloud server, I assume) is mainly the responsibility of your code, unless you want to show the sync status of each "new" note in the UI. I would solve this use case by adding a way to attach generic metadata (key-value pairs) to each note, so you can extend the data model to the needs of your particular use case.

This is a timely discussion as I have been doing some work to add support for Notes and Regions to Freeboard but the thought of retrieving 15000 notes records via /resources/notes does not really appeal.

I think resources in general could benefit from a specific section in the spec to ensure implementations offer a consistent way for applications to query, create, update and delete resources. There is already an implementation discrepancy around PUTting resources to the /resources/* path between the Node and Java servers.

Also as a server may aggregate resources from multiple sources, a means of querying local resources vs remote resources would be useful.

This may require a separate issue but just highlighting that resources probably deserve a more detailed representation in the specification now.

rob42 commented 5 years ago

The artemis-server supports a subset of jsonpath queries in paths, and also does partial completion, e.g. /vessels/self/nav. It's possible to request /resources/notes/*/geohash, but it doesn't currently support key=value jsonpath semantics.

Would more complete jsonpath query semantics help here? See https://goessner.net/articles/JsonPath/
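For readers unfamiliar with wildcard paths, the sketch below resolves a path such as /resources/notes/*/geohash against a nested dictionary. It is neither JSONPath nor artemis-server's query engine; the `query` function and its return shape (matched path to value) are assumptions made to illustrate the semantics.

```python
def query(tree, path):
    """Resolve a slash-separated path where '*' matches any key at that level.

    Returns a dict mapping each matched concrete path to its value.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    results = {}

    def walk(node, remaining, trail):
        if not remaining:
            results["/" + "/".join(trail)] = node
            return
        if not isinstance(node, dict):
            return  # path goes deeper than the data does
        head, rest = remaining[0], remaining[1:]
        if head == "*":
            keys = list(node.keys())
        else:
            keys = [head] if head in node else []
        for key in keys:
            walk(node[key], rest, trail + [key])

    walk(tree, parts, [])
    return results


# Hypothetical data: only one of the two notes carries a geohash.
tree = {
    "resources": {
        "notes": {
            "n1": {"title": "Anchorage", "geohash": "u4pruy"},
            "n2": {"title": "Fuel"},
        }
    }
}
hits = query(tree, "/resources/notes/*/geohash")
```

A key=value filter would be a natural next step on top of this, applied to each matched node before it is added to the results.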

panaaj commented 5 years ago

I think something like this would help. Resources are persisted beyond a server restart and will most likely have a range of store types.... some which have extensive querying capabilities (databases) some with less extensive mechanisms (file system). So a well defined, capable means to query data regardless of the store type is essential.

Thinking about some use case examples... if it can service these types of queries then I'd say we would be heading in the right direction:

irandrews commented 5 years ago

My plugin holds a locally cached version of our global database. The plugin serves a geo-based subset of these notes according to what you ask for:

rob42 commented 5 years ago

To go back to the original issue: how is the server supposed to tell that the new note is 'related' to existing notes, and not an entirely new one?

Assuming we store notes as unique /resources/notes/xxxx... associating them together must be done with data inside the note, currently we have position, region, or geohash. These are all location related and have value as such, but there are other reasons to associate notes (and resources in general)

What if you want to see all the fishing exclusion zones (regions), or a related thread of notes, like a voyage, tour, or related info? This is usually handled in forums, mail, etc. by arbitrarily tagging the entries.

Perhaps we should add an array, tags, to resources?

irandrews commented 5 years ago

I think that would work for me. I totally agree that the mechanism to relate notes together should not be tied irrevocably to a specific data attribute.

rob42 commented 5 years ago

Are you up to a pull request for the schema? Adding something like:

In resources.json

"tags": {
  "description": "Arbitrary tags to relate notes together",
  "$ref": "../definitions.json#/definitions/tags"
},

and in definitions.json

"tags": {
  "id": "http://jsonschema.net/signalk/definitions/tags",
  "type": "array",
  "description": "tags...",
  "name": "tags",
  "items": {
    "type": "string",
    "description": "tag name"
  }
}

and some tests?

See the existing schemas for examples

tkurki commented 5 years ago

Why tags and not key-values?

rob42 commented 5 years ago

Not sure what you mean?

If you mean using "tags.value": [..], yes, it should do that. I wasn't thinking.

The idea of the tags array is to add arbitrary strings as tags, e.g. 'anchorage', 'fishing', 'Abel Tasman', etc. I guess you could use tags.anchorage etc. Might that make wildcard search easier?

irandrews commented 5 years ago

Can you give me an example of how this would work for my use case? I have 7500 unique locations, each of which has between 2 and 20 'related' notes. It would have to be a solution that is sufficiently generic that any client could support anything structured in a topic+comments approach.

Would it be something like:

1:
{
  "uuid": 1,
  "title": "Turku Archipelago",
  "tags": {
    "related": "<some unique random id>"
  }
},

2:
{
  "uuid": 2,
  "title": "Turku Archipelago",
  "tags": {
    "related": "<same random id as above>"
  }
},

3:
{
  "uuid": 3,
  "title": "Turku Archipelago",
  "tags": {
    "related": "<same random id as above>"
  }
}

rob42 commented 5 years ago

My first thought was just an array of strings, eg

1:
{
  "uuid": 1,
  "title": "Turku Archipelago",
  "tags": [
    "<uuid of 1>",
    "Turku Archipelago",
    "general"
  ]
},

2:
{
  "uuid": 2,
  "title": "Fishing in Turku Archipelago",
  "tags": [
    "<uuid of 1>",
    "Turku Archipelago",
    "fishing"
  ]
},

3:
{
  "uuid": 3,
  "title": "Turku Archipelago anchorage",
  "tags": [
    "<uuid of 1>",
    "Turku Archipelago",
    "anchoring"
  ]
}

That would allow a generic search of tags array for any full or partial terms.
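A generic search over such a tags array might be sketched like this. `search_by_tag` is a hypothetical helper, matching is a case-insensitive substring test (one possible reading of "full or partial terms"), and the `thread-1` tag stands in for the shared uuid in the example above.

```python
def search_by_tag(notes, term):
    """Return notes having at least one tag that contains term (case-insensitive)."""
    needle = term.lower()
    return {
        note_id: note
        for note_id, note in notes.items()
        if any(needle in tag.lower() for tag in note.get("tags", []))
    }


# Hypothetical notes mirroring the structure above.
notes = {
    "1": {"title": "Turku Archipelago", "tags": ["thread-1", "Turku Archipelago", "general"]},
    "2": {"title": "Fishing in Turku Archipelago", "tags": ["thread-1", "Turku Archipelago", "fishing"]},
    "3": {"title": "Turku Archipelago anchorage", "tags": ["thread-1", "Turku Archipelago", "anchoring"]},
}
```

Searching for `"fish"` would pick out only the second note, while `"thread-1"` or `"turku"` would return the whole thread.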

rob42 commented 5 years ago

You could define specific tags for related etc., but what if it's related to more than one other note, or you need a new tag name, e.g. fuel? It is also potentially complex to provide a generic UI.
But it's your use case; I'm open to any options really.

Thinking more on the UI, something like a component that: 1) searches within a radius/region/etc. 2) returns all unique tags for all notes found in 1) 3) lets the user select one or more tags 4) returns the matching notes 5) shows each note with a list of tags in the sidebar, and a 'View Thread' button that shows related notes based on the selected tags...

irandrews commented 5 years ago

From the server perspective I don't care much. I know which notes are related and can supply them as a list, but when the client supplies a new note, I need to know which, if any, notes the new one is related to, and I have 7500 (and growing) different combinations, so it has to be something unique for that list. I would have thought a key:value pair was a bit easier to parse, but I take the point about being generic.

irandrews commented 5 years ago

So, summarising: Creating a structure similar to the above:

1:
{
  "uuid": 1,
  "title": "Turku Archipelago",
  "tags": [
    "<unique uuid>",
    "Turku Archipelago",
    "general"
  ]
},

2:
{
  "uuid": 2,
  "title": "Fishing in Turku Archipelago",
  "tags": [
    "<same unique uuid>",
    "Turku Archipelago",
    "fishing"
  ]
},

3:
{
  "uuid": 3,
  "title": "Turku Archipelago anchorage",
  "tags": [
    "<same unique uuid>",
    "Turku Archipelago",
    "anchoring"
  ]
}

would allow the client to retrieve: a) a single note by using <uuid> b) A bunch of related notes by using ?tags=<the unique uuid above> The client would know by the presence of the tag whether there were related notes. Do we need this differentiation? The only issue is that for my use case I'd have to make sure the notes only had one tag, otherwise the client might send the server the wrong one.
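The two retrieval modes summarised here could be handled roughly as below. The `?tags=` query parameter is the proposal from this thread, not an existing Signal K API; `handle_get` and the sample data are hypothetical, and the URL prefix is only an example path.

```python
from urllib.parse import parse_qs, urlparse


def handle_get(notes, url):
    """Serve a single note by id, or the notes carrying every requested tag."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    wanted = parse_qs(parsed.query).get("tags", [])
    if segments and segments[-1] != "notes":
        # a) .../notes/<uuid> returns just that note, if it exists
        note_id = segments[-1]
        return {note_id: notes[note_id]} if note_id in notes else {}
    if wanted:
        # b) .../notes?tags=<value> returns every note carrying all the given tags
        return {k: n for k, n in notes.items() if set(wanted) <= set(n.get("tags", []))}
    return dict(notes)


# Hypothetical data: notes 1 and 2 share a thread tag, note 9 stands alone.
notes = {
    "1": {"title": "Turku Archipelago", "tags": ["thread-1", "general"]},
    "2": {"title": "Fishing in Turku Archipelago", "tags": ["thread-1", "fishing"]},
    "9": {"title": "Unrelated", "tags": ["thread-2"]},
}
related = handle_get(notes, "/signalk/v1/api/resources/notes?tags=thread-1")
single = handle_get(notes, "/signalk/v1/api/resources/notes/2")
```

Note that this does not solve the ambiguity raised above: with several tags per note the client still has to know which one identifies the thread.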

tkurki commented 5 years ago

How about something like this:

{
  "resources": {
    "regions": {
      "uuidr1": {
        "title": "Tallinn Old City Marina",
        "feature": {
          "type": "Point",
          "coordinates": [24.7313, 59.4672]
        }
      }
    },
    "notes": {
      "uuidn1": {
        "region": "uuidr1",
        "title": "note title here",
        "description": "Tallinn city marina was great",
        "timestamp": "2018-07-06T17:33:22.786Z",
        "author": "Teppo Kurki"
      },
      "uuidn1123434": {
        "region": "uuidr1",
        "title": "Update Tallinn City Marin",
        "description": "New note for 2019, not yet saved",
        "timestamp": "2019-03-11T11:00:31.966Z",
        "author": "Ivan Andrews"
      }
    }
  }
}

Changes:

Then a way to tell per note if it is editable. Not extremely fond of a boolean readonly, but I guess we are talking about a boolean here.

This is what I meant with extensible key-values, a.k.a. properties:

{
  "resources": {
    "regions": {
      "uuidr1": {
        "title": "Tallinn Old City Marina",
        "feature": {
          "type": "Point",
          "coordinates": [24.7313, 59.4672]
        },
        "properties": {
          "CA:type": "MARINA"
        }
      }
    },
    "notes": {
      "uuidn1": {
        "region": "uuidr1",
        "title": "note title here",
        "description": "Tallinn city marina was great",
        "timestamp": "2018-07-06T17:33:22.786Z",
        "properties": {
          "CA:Created": "2018-07-07T11:00:31.966Z",
          "CA:AuthorName": "Teppo Kurki",
          "CA:AuthorId": "D134"
        }
      },
      "uuidn1123434": {
        "region": "uuidr1",
        "title": "Update Tallinn City Marin",
        "description": "New note for 2019, not yet saved",
        "timestamp": "2019-03-11T11:00:31.966Z",
        "properties": {
          "CA:AuthorName": "Ivan Andrews",
          "CA:AuthorId": "D7160"
        }
      }
    }
  }
}
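A client or plugin consuming such namespaced properties might read them as sketched here. The `CA:` prefix and the `CA:AuthorId` key come from the example above; the helper names `namespaced` and `editable_by` are hypothetical.

```python
def namespaced(properties, prefix):
    """Return the properties under one namespace, with the 'prefix:' part removed."""
    marker = prefix + ":"
    return {k[len(marker):]: v for k, v in properties.items() if k.startswith(marker)}


def editable_by(note, author_id):
    """A note counts as editable when its CA:AuthorId matches the given user id."""
    return note.get("properties", {}).get("CA:AuthorId") == author_id


# Sample note shaped like the example above.
note = {
    "title": "note title here",
    "properties": {
        "CA:Created": "2018-07-07T11:00:31.966Z",
        "CA:AuthorName": "Teppo Kurki",
        "CA:AuthorId": "D134",
    },
}
ca_fields = namespaced(note["properties"], "CA")
```

Clients that do not recognise a namespace can simply ignore those keys, which keeps the core note schema free of use-case-specific fields.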
panaaj commented 5 years ago

Just so I am clear: with the proposed changes a note will not stand alone, it will always be linked to a region? I see no problem with notes needing to be attached to a resource, but maybe they should be able to be attached to any resource type. Maybe the region: attribute in Note could be resource:, which is a path to the associated resource (like start / end are references to Waypoints in a Route).

tkurki commented 5 years ago

No, region is optional.

panaaj commented 5 years ago

Should it then be an optional resource attribute rather than region to allow notes to be attached to other resource types? e.g.

"notes": {
      "uuidn1": {
        "resource": "resources/regions/uuid1",
        "title": "note title here",
        "description": "Tallinn city marina was great",
        "timestamp": "2018-07-06T17:33:22.786Z",
        "properties": {
          "CA:Created": "2018-07-07T11:00:31.966Z",
          "CA:AuthorName": "Teppo Kurki",
          "CA:AuthorId": "D134"
        }
      },
      ...
}
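With a path-valued `resource` attribute, following the link becomes a generic tree lookup rather than a region-specific one. The sketch below assumes the proposed (not yet specified) `resource` field holds a slash-separated path from the root of the data model; `resolve`, `notes_for_resource`, and the sample ids are illustrative only.

```python
def resolve(tree, path):
    """Follow a slash-separated path through nested dicts; None if it does not exist."""
    node = tree
    for part in path.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def notes_for_resource(tree, resource_path):
    """Return every note whose 'resource' attribute points at resource_path."""
    notes = tree.get("resources", {}).get("notes", {})
    return {k: n for k, n in notes.items() if n.get("resource") == resource_path}


# Hypothetical data: one note attached to a region, one standalone.
tree = {
    "resources": {
        "regions": {"r1": {"title": "Tallinn Old City Marina"}},
        "notes": {
            "n1": {"resource": "resources/regions/r1", "title": "Great marina"},
            "n2": {"title": "Standalone note"},
        },
    }
}
target = resolve(tree, tree["resources"]["notes"]["n1"]["resource"])
```

The same two functions would work unchanged for notes attached to waypoints, routes, or any other resource type, which is the point of making the attribute a path.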