Open keighrim opened 1 year ago
The problem with the lapps vocabulary is that it wasn't really designed for versioning from the beginning
I don't see where this statement comes from because it was designed to be versioned and it was pretty similar to how we versioned CLAMS.
Having said that, the LAPPS versioning is clearly very different from our current versioning. I agree option 3 is not a good one without independent LAPPS funding. Option 2 works for now and is also something that we have used for CLAMS. In the long run some version of option 1 is probably the way to go.
Not that many types need to be copied (at most about a dozen) so it will not explode our vocabulary. For each of the copied types we can use the similarTo property to refer to the old LAPPS type. We do not have to do all of them at once, and we can always decide to do only the ones we need for now, which could be just the NamedEntity type.
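As a rough illustration of the copy-with-back-reference idea, here is a minimal Python sketch. The `VocabType` record, the `copy_from_lapps` helper, the `Span` parent, and the LAPPS base URL are all assumptions made for the example; they are not the actual CLAMS vocabulary schema or tooling.

```python
from dataclasses import dataclass, field

# Assumed base URL of the LAPPS vocabulary (illustrative only).
LAPPS_BASE = "http://vocab.lappsgrid.org"


@dataclass
class VocabType:
    """Hypothetical, simplified record for one vocabulary type."""
    name: str
    parent: str
    description: str = ""
    # Back-references to equivalent types in other vocabularies.
    similar_to: list = field(default_factory=list)


def copy_from_lapps(name: str, parent: str, description: str = "") -> VocabType:
    """Create a CLAMS-side copy of a LAPPS type that points back to the original."""
    return VocabType(name, parent, description, similar_to=[f"{LAPPS_BASE}/{name}"])


# Copy only the one type that is needed right now; the parent is an assumption.
named_entity = copy_from_lapps("NamedEntity", parent="Span")
print(named_entity.similar_to)
```

Because each copied type carries its own back-reference, types could be migrated one at a time or all at once without changing how the link to the old LAPPS type is recorded.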
I don't see where this statement comes from because it was designed to be versioned and it was pretty similar to how we versioned CLAMS.
Hmmm, not that I remember... My statement above is coming from
To my understanding, the lapps vocab was not started with versioning in mind. The versioning we added (circa 2017) was ad hoc, and even since then it wasn't really picked up by other components in the framework. I think that was one big lesson we learned from our mistake, and it led us to start CLAMS with much more serious consideration of MMIF spec versioning from the very beginning.
I agree that for now we can stay in the option 2 area, maybe until we see a couple more changes we want to add to the LAPPS vocab types. However, once we decide to move in the option 1 direction, I think we'd be better off copying the whole thing** at once, and never considering a reusable migration process (as a piece of code or as manual labor). Even with all the lapps types, I don't think the size of the CLAMS vocab will explode.
** (regarding the back-references to the lapps URLs based on the similarTo information, I think we can do something like a.k.a. links at the top of vocab type web pages)
The suggested repositioning of the lapps types into the clams tree is all agreeable. For the graph structures (PhraseS, DepS), I don't have a better solution either. But for the coreference, I'm thinking we can expand the coreferences into some kind of alignment between video objects (bounding boxes) and textual mentions (e.g. faces to names). It's still a very vague idea, and I'll try to polish it into a better-formed proposal.
@keighrim I see, my memory was somewhat different. But you were involved in LAPPS since 2014? That is also longer than I remember.
In any case, how good or bad the LAPPS versioning was is somewhat immaterial. The current state has versioning similar to what we used to have in CLAMS and it does make sense to assimilate LAPPS types so we have consistent versioning across the types we are using. I have no strong feelings on whether we add elements one by one or all at the same time.
But for the coreference, I'm thinking we can expand the coreferences into some kind of alignment between video objects (bounding boxes) and textual mentions (e.g. faces to names). It's still a very vague idea, and I'll try to polish it into a better-formed proposal.
Interesting, coreference and alignment do indeed have some things in common. At some point I was wondering about using some kind of a grounding mechanism where video objects and text mentions would map to the same entity in a database. That went nowhere but I am curious to hear where you are going here.
LAPPS vocab recently started showing even more signs of degradation when the website became inaccessible via modern web browsers with sane security features, due (probably) to the TLS certificate expiration.
Extended discussion from https://github.com/clamsproject/app-dbpedia-spotlight-wrapper/issues/2; some related discussion can also be found in #86.
The problem with the lapps vocabulary is that it wasn't really designed for versioning from the beginning, and even if we do start properly versioning it (which I don't think is doable in any viable future that suits the Mellon grant timeline, given the current status of the lappsgrid infrastructure and the funding situation), we don't know how to properly integrate lapps at_type versions into MMIF versions (partly because we no longer use LD-contexts).
My assessment of possible direction from here is
I think
So that leaves us option 1.
But I'd like to hear more about other possible alternatives.