thegoatherder closed 7 months ago
ya, really cool idea. Agree that tags are limited as data, and have taken us really far - (probably too far). I also like the idea for storing captured metadata, like date metadata, within reach of compromise somehow.
Imagine if we could do something in match queries with the json like:
let doc=nlp('paul, john lennon and ringo starr')
doc.match('ringo starr').payload({roles:['drummer', 'singer'], hair:'long'})
//then later...
doc.match("and {roles:'drummer'}") //or something
Been stuck, forever, on this same dilemma - where to store information about groups of words. The good news is that they are just javascript objects, and we can stick stuff anywhere.
View objects are transient. Every method returns a new one, and would need to marshal any data payload around with every interaction. Old views would have stale payloads. I don't think it's the right place for this.
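To make the stale-payload concern concrete, here is a toy model in plain JavaScript. It is only a sketch: `ToyView` is a hypothetical stand-in and not compromise's actual View class. The only assumption is that each match hands back a fresh object, as described above.

```javascript
// Toy model only: ToyView is hypothetical, not compromise's View class.
class ToyView {
  constructor(words) {
    this.words = words
    this.data = undefined
  }
  // Setter when called with data, getter when called without.
  payload(data) {
    if (data === undefined) return this.data
    this.data = data
    return this
  }
  // Like a match, this returns a brand-new object on every call.
  match(word) {
    return new ToyView(this.words.filter((w) => w === word))
  }
}

const doc = new ToyView(['ringo', 'starr'])
doc.match('ringo').payload({ roles: ['drummer'] })

// A second, identical match is a different object, so the data is gone.
const again = doc.match('ringo')
console.log(again.payload()) // undefined
```

Anything stored on the view itself would have to be copied forward by every method, which is the marshalling cost mentioned above.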
Putting payloads in Term objects would also be the wrong place - 'ringo' and 'starr' would need dangled or duped data between them.
Open to it, just haven't got it clear yet.
@spencermountain
Just throwing some ideas around in case they offer any inspiration... far from a solution...!
What if there was some new layer like compromise/four with a method like .commit() that could commit a View and store it separately in the document.
const someObj = {} // my payload
const view = nlp('See you next September').match('next #Month').commit()
view.payload(someObj)
.commit() could hash the Term.IDs to generate a deterministic ID for the View on .commit(). This would ensure that a committed View can be later updated with new data if needed.
doc: {
commits: {
"somehash1": {
terms: [], // list of Terms
payload: {} // the payload data
}
}
}
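A minimal sketch of how that could hang together, in plain JavaScript with no compromise dependency. Everything here is hypothetical: `commitId`, `commit`, `lookup` and `store` are illustrative names, and FNV-1a is just a stand-in for whatever checksum a real .commit() might use. The term IDs are copied from the example output in this thread.

```javascript
// Hypothetical sketch: none of these functions exist in compromise.
// FNV-1a (32-bit) stands in for whatever checksum a real .commit() would use.
function commitId(termIds) {
  let hash = 0x811c9dc5
  for (const ch of termIds.join(',')) {
    hash ^= ch.codePointAt(0)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash.toString(16).padStart(8, '0')
}

// Mirrors the `commits` structure proposed above.
const store = { commits: {} }

// Create the commit for these term IDs, or merge new data into an existing one.
function commit(termIds, payload) {
  const id = commitId(termIds)
  const entry = store.commits[id] || { terms: termIds, payload: {} }
  entry.payload = { ...entry.payload, ...payload }
  store.commits[id] = entry
  return id
}

// Re-derive the ID from the same term IDs to find the payload again.
function lookup(termIds) {
  const entry = store.commits[commitId(termIds)]
  return entry ? entry.payload : undefined
}

const nextMonth = ['next|00700002C', 'september|00800003V']
const month = ['september|00800003V']
commit(nextMonth, { a: 1 })
commit(month, { a: 2 })
console.log(lookup(nextMonth), lookup(month))
```

Because the ID depends only on the term IDs, the same match always lands on the same commit, and two different matches that share a term keep separate payloads.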
This would allow for Terms to hold different data in different contexts. For example, matches of next #Month and #Month could both attach data to the Term September, but independently. A user could then:
const payload1 = { a: 1 }
const payload2 = { a: 2 }
const doc = nlp('See you next September')
doc.match('next #Month').commit().payload(payload1)
doc.match('#Month').commit().payload(payload2)
// ... later in the app
doc.match('next #Month').payload() // Generate checksum for this match and use it to lookup payload1 data from the commit
doc.match('#Month').payload() // Generate checksum for this match and use it to lookup payload2 data from the commit
The data could also be output by the .json() function:
doc.match('next #Month').json()
[
{
"text": "next september",
"terms": [
{
"text": "next",
"pre": "",
"post": " ",
"tags": [
"Adjective"
],
"normal": "next",
"index": [
0,
2
],
"id": "next|00700002C",
"dirty": true,
"chunk": "Noun"
},
{
"text": "september",
"pre": "",
"post": "",
"tags": [
"Date",
"Noun",
"Month"
],
"normal": "september",
"index": [
0,
3
],
"id": "september|00800003V",
"chunk": "Noun",
"dirty": true
}
],
"payload": {} // ***** MY PAYLOAD *****
}
]
I think, but am not sure, that this might also support your (excellent!) suggestion of a new match syntax based on payloads:
doc.match("and {roles:'drummer'}") //or something
The matcher could simply know that when it sees {roles:'drummer'} it has to go and find all committed views that have that data, return their term IDs and use those to complete the match, like and ringo|00012ABC starr|0A11A00B
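That lookup step might look roughly like this. Again a sketch only, in plain JavaScript: `payloadMatches` and `termsForPayload` are hypothetical helpers, the `commits` object follows the structure proposed earlier, and the second commit's term ID is made up for the example.

```javascript
// Hypothetical sketch: resolve a payload query to the term IDs of committed views.
const commits = {
  somehash1: {
    terms: ['ringo|00012ABC', 'starr|0A11A00B'],
    payload: { roles: ['drummer', 'singer'], hair: 'long' },
  },
  somehash2: {
    terms: ['paul|EXAMPLE01'], // made-up ID, for illustration only
    payload: { roles: ['bassist', 'singer'] },
  },
}

// A query key matches if the payload value equals it, or is an array containing it.
function payloadMatches(payload, query) {
  return Object.entries(query).every(([key, want]) => {
    const have = payload[key]
    return Array.isArray(have) ? have.includes(want) : have === want
  })
}

// Return the term-ID list of every commit whose payload satisfies the query.
function termsForPayload(allCommits, query) {
  return Object.values(allCommits)
    .filter((c) => payloadMatches(c.payload, query))
    .map((c) => c.terms)
}

const hits = termsForPayload(commits, { roles: 'drummer' })
// Each hit could then be spliced into the rest of the pattern as an ID match.
console.log(hits.map((ids) => 'and ' + ids.join(' ')))
```

A linear scan over commits is fine for a sketch; an index keyed by payload field would be the obvious next step if there were many commits.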
check out the compromise-payload plugin ⚡
I'm looking for some kind of method on a View which will enable me to attach some data to words, the same way that I can attach tags. Does anything like this already exist?
Here's a rough sketch:
I think this could be an extremely powerful feature for our project and I'm sure many others...!
Example Use Case:
spacetime or some lib)