polifonia-project / sonar2021_demo

This repository was created to document the Polifonia demo that is going to be presented at SONAR2021
https://polifonia-project.github.io/sonar2021_demo/

ETL: data from Polifonia KG to Sonar App - Spatial Annotation #37

Closed ccolonna closed 2 years ago

ccolonna commented 2 years ago

These entities probably won't be explicitly reified in the Knowledge Graph but should be extracted by means of some script or procedure. This script will play the role of the spatial bot, analyzing the graph and producing Annotations about songs that people can subscribe to, see and like in the app.

This is the current Annotation interface consumed by the application:

export interface Annotation {
  id: string;
  type: string;
  songID: string;
  timestamp: number;
  relationships: Relationship[];
  description?: string;
  metadata?: AnnotationMetadata;
}
export interface Relationship {
  songID: string;
  type: string;
  score: number;
}
export interface AnnotationMetadata {}

export interface LocationMetadata extends AnnotationMetadata {
  long?: number;
  lat?: number;
  placeName?: string;
}

An example of an annotation currently processed by the app: data
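For illustration only, a spatial annotation conforming to the interfaces above might look like the sketch below. All values (ids, place name, coordinates, the "spatial" type string) are hypothetical placeholders and are not taken from the app's data; `hasCoordinates` is an assumed helper, not part of the app.

```typescript
// Interfaces as given in this issue.
export interface Relationship {
  songID: string;
  type: string;
  score: number;
}
export interface AnnotationMetadata {}
export interface LocationMetadata extends AnnotationMetadata {
  long?: number;
  lat?: number;
  placeName?: string;
}
export interface Annotation {
  id: string;
  type: string;
  songID: string;
  timestamp: number;
  relationships: Relationship[];
  description?: string;
  metadata?: AnnotationMetadata;
}

// Hypothetical location: the values are placeholders, not real KG data.
const exampleLocation: LocationMetadata = {
  long: 11.34,
  lat: 44.49,
  placeName: "Bologna",
};

// Hypothetical spatial annotation linking song-1 to song-2 through a place.
const exampleAnnotation: Annotation = {
  id: "annotation-1",
  type: "spatial",
  songID: "song-1",
  timestamp: 0,
  relationships: [{ songID: "song-2", type: "spatial", score: 1 }],
  description: "Recorded at Bologna",
  metadata: exampleLocation,
};

// Assumed helper: true when the metadata carries both coordinates.
function hasCoordinates(meta: AnnotationMetadata | undefined): boolean {
  const m = meta as LocationMetadata | undefined;
  return typeof m?.lat === "number" && typeof m?.long === "number";
}
```

Because every field of `LocationMetadata` is optional, a check like `hasCoordinates` is one way the app could decide whether an annotation can be placed on a map.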

As suggested by @vpresutti, we should also show the motivation by which two songs are linked. Here is a discussion on the topic and some solutions proposed by @JaseMK.

ccolonna commented 2 years ago

A procedure suggested by @vpresutti and discussed with @delfimpandiani to extract spatial annotations from Polifonia KG data:

Delfina:

As context, we see the SONAR Interface Application as a sort of social network where users (including artificial ones) can produce “notes” (annotations) about recordings. AI bots are the type of users that analyze the Polifonia KG and create “notes” related to specific recordings. In this way, the “note” (annotation) is an application-specific entity from a user about a recording.

So far we have identified and defined only one criterion for the Spatial Bot to create notes from the Polifonia KG: the spatial bot should create annotations that involve spatial information (a Place which has Latitude and Longitude).

Then we have questions that need to be answered:

- How do we identify the degree of spatial similarity between two recordings? Is this going to be modeled in the Polifonia KG?

These annotations are going to be created “on the fly” by the bot (not modeled in the Polifonia KG). Questions regarding the annotations:

- Should the spatial annotation relate 2 or more songs? Or is an annotation related to only 1 song acceptable?
- What are the requirements that need to be met for a spatial annotation to be reified? I.e., how do we identify a metric for the degree of “interestingness” for an annotation to be created? (For example, are birthplaces of artists always relevant to a recording?)
- Is there a number of spatial annotations each recording should have? I.e., is there a limit for the sake of the demo?

ValentinaP:

Let me try and give you some guidelines on this. All the points you raise are good ones, however we don't have time to handle them. We don't have time to define similarity metrics, ranking criteria, etc. All these aspects will be interesting to study as next steps, after the Sonar is over, as a continuation of this research.

For now this is what I'd do. I would find a way to index the KG data by places, like you would do for a text-based search: take a place and index all songs that are related to it.

For example, let's consider the place p1 and the songs s1, s2 and s3 that have a place-relation with property value p1. We'll have the following triples:

s1 r1 p1
s2 r2 p1
s3 r3 p1

assuming that r1, r2, r3 are place-related properties, e.g. birthplace of the author, recording place, song written at place.

- Let's assume r1 rdfs:label "recorded at", while r2 and r3 do not have a label.
- You create a meaningful label for r2 (author birthplace) and r3 (written at), while you can reuse r1's label.
- You create an index: p1 -> (s1, "recorded at"), (s2, "author birthplace"), (s3, "written at")
- When the app is playing s1, the place agent will show something like:

Musical Places related songs
s2 -> author birthplace p1
s3 -> written at p1

How many of them: let's show "n" of them, picked randomly. I would give users the possibility to remove related songs to make new ones appear.
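The indexing procedure described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the `PlaceTriple` shape, the function names `buildPlaceIndex` and `pickRelated`, and the hand-written label table are all hypothetical, and the triples are the s1/s2/s3 example from the comment rather than real KG data.

```typescript
// Hypothetical shape of a place-related triple: song --property--> place.
interface PlaceTriple {
  song: string;
  property: string;
  place: string;
}

interface IndexEntry {
  song: string;
  label: string;
}

interface RelatedSong extends IndexEntry {
  place: string;
}

// Labels per property: r1's label is reused, r2 and r3 are written by hand.
const propertyLabels: Record<string, string> = {
  r1: "recorded at",
  r2: "author birthplace",
  r3: "written at",
};

// The example triples from the comment.
const exampleTriples: PlaceTriple[] = [
  { song: "s1", property: "r1", place: "p1" },
  { song: "s2", property: "r2", place: "p1" },
  { song: "s3", property: "r3", place: "p1" },
];

// Build the index: place -> list of (song, label).
function buildPlaceIndex(
  triples: PlaceTriple[],
  labels: Record<string, string>
): Record<string, IndexEntry[]> {
  const index: Record<string, IndexEntry[]> = {};
  triples.forEach((t) => {
    if (!index[t.place]) index[t.place] = [];
    // Fall back to the property name when no label is available.
    index[t.place].push({ song: t.song, label: labels[t.property] || t.property });
  });
  return index;
}

// Pick up to n random songs that share a place with the song being played.
function pickRelated(
  index: Record<string, IndexEntry[]>,
  currentSong: string,
  n: number,
  random: () => number = Math.random
): RelatedSong[] {
  const candidates: RelatedSong[] = [];
  Object.keys(index).forEach((place) => {
    const entries = index[place];
    if (!entries.some((e) => e.song === currentSong)) return;
    entries.forEach((e) => {
      if (e.song !== currentSong) candidates.push({ song: e.song, label: e.label, place });
    });
  });
  // Fisher-Yates shuffle, then keep the first n candidates.
  for (let i = candidates.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = candidates[i];
    candidates[i] = candidates[j];
    candidates[j] = tmp;
  }
  return candidates.slice(0, n);
}

const exampleIndex = buildPlaceIndex(exampleTriples, propertyLabels);
```

With this sketch, `pickRelated(exampleIndex, "s1", 2)` would return s2 and s3 in random order, each with its label and the place p1. Letting users remove a related song, as suggested above, could then be handled by filtering the candidates and picking again.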

ccolonna commented 2 years ago

@phivk @JaseMK ,

For spatial annotations I don't think we will have a timestamp. In the data we have information like: this song was recorded at, was played at, the author's birthplace is, etc. Nothing like: at minute 1.37 of this song the author talks about this place. At least from what I have seen so far.

What do you think we can do?

JaseMK commented 2 years ago

Thanks @ccolonna

I think we set the timestamp to 0. There will be several annotations, not just spatial ones, that have no time component. Then we can decide on functionality in the app for when it imports an annotation with a timestamp of 0. I like the idea of setting a random time, but perhaps some time between [songStart,30], for example. We do have song duration from the YouTube API, yes, but (especially for the demo) we probably don't want to have to wait 4 minutes to find out if a spatial annotation is going to appear, so can possibly load them toward the front half/quarter/etc of the song.

Leave this with me and I'll implement some functionality to deal with timestamp 0.
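One possible way to treat a timestamp of 0 on import, sketched under assumptions: the function name `resolveTimestamp`, the 30-second default window, and the injectable `random` parameter are hypothetical and do not describe the functionality actually implemented in the app.

```typescript
// Assumed convention from this thread: timestamp 0 means "no time component".
// Such annotations get a random display time near the start of the song.
function resolveTimestamp(
  timestamp: number,
  songDurationSeconds: number,
  windowSeconds = 30,
  random: () => number = Math.random
): number {
  // Annotations with a real time component are left untouched.
  if (timestamp !== 0) return timestamp;
  // Never schedule past the end of a song shorter than the window.
  const upper = Math.min(windowSeconds, songDurationSeconds);
  // Whole second in the range [0, upper).
  return Math.floor(random() * upper);
}
```

Replacing the fixed window with a fraction of the song duration (for example a quarter of it) would be a small change to how `upper` is computed.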

ccolonna commented 2 years ago

Ok, thanks,

I like your idea, I'll just set timestamp 0 and leave this with you!

ccolonna commented 2 years ago

Hi @phivk, at this url you can find the latest version of the Polifonia places KG. The URL shouldn't change until the repository is reorganized. When we push an update we will just overwrite that file, so that you can always build the transformation pipeline against the latest version of the KG.

phivk commented 2 years ago

thanks @ccolonna , I think it will be helpful to have a "...-latest.ttl" available at that location.

However, as this file will undergo changes, it would be very helpful for me to also have stable versioned files in that directory that will not be updated. That way I have stable files I can develop against.

Could we agree on that practice?

ccolonna commented 2 years ago

For sure!

Here you can find stable versioned files: https://github.com/polifonia-project/sonar2021_demo/tree/datasets/polifonia_places_etl/kg/versions

ccolonna commented 2 years ago

Close here