vsimko / gama-gateway

Gama Gateway RDF Repository and GAMA data model

Empty properties #52

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Using the GAMA Repository Statistics, I found that several fields the content partners are asking about are still empty. What is the status of those fields?

I suppose it is a combination of unclear databases, difficulty with communication, etc. I really think there needs to be more communication between the content partners and the database adapter team, either to clear up uncertainties in the database translation or to make the content partners' expectations more realistic.

http://research.ciant.cz/gama/devel/GamaRepository/endpoint/estimate.php

An incomplete list of missing properties:
http://gama-gateway.eu/cache/works_by_keyword   0   0
(I read a while ago that keywords were being added, but it seems they are still not in?)

http://gama-gateway.eu/schema/manif_format      0   0
http://gama-gateway.eu/schema/manif_language    0   0
http://gama-gateway.eu/schema/manif_with_colour 0   0
http://gama-gateway.eu/schema/manif_with_sound  0   0
http://gama-gateway.eu/schema/media_type        0   0
http://gama-gateway.eu/schema/work_right        0   0

What is the status of at least these fields?

Original issue reported on code.google.com by charles....@kmt.hku.nl on 8 Jul 2009 at 2:49

GoogleCodeExporter commented 9 years ago
I think that at least manif_with_colour, manif_with_sound and manif_format should be filled by the indexing engine.

Original comment by viliam.s...@gmail.com on 12 Jul 2009 at 10:31

GoogleCodeExporter commented 9 years ago
This is not as easy as it seems at first sight.
First of all, I only see the manifestations that are uploaded for indexing, so I can provide metadata only for these. Then:

manif_format:
This is very easy to integrate and will be contained in my next export.

manif_with_colour:
Note that there is no special representation for greylevel videos as there is for images; typically we deal with the YUV colour space. So the only way to find out whether a video has colour would be to check, for every single frame of the video file, whether only the Y plane is used and U and V have values close to zero (typically these will not be exactly zero, so we would need to apply some threshold here).

I could build this in, but it might also be faulty: e.g., a black and white video could have credits in a blue font or similar. In that case the database of the CP might still say this is a black and white video, while the above approach would say it is "with colour".
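
The chroma check described above could be sketched roughly as follows. This is only an illustration, not the indexing engine's actual code: it assumes decoded frames are available as 8-bit Y, U and V sample sequences, where neutral chroma is stored as 128 rather than 0, and the names `frame_has_colour`, `video_has_colour`, `chroma_threshold` and `min_colour_fraction` are hypothetical. The `min_colour_fraction` parameter is an added assumption that lets a few coloured frames (such as blue credits) be ignored.

```python
from typing import Iterable, Sequence, Tuple

# A frame is modelled as (y, u, v) planes, each a flat sequence of 8-bit samples.
Frame = Tuple[Sequence[int], Sequence[int], Sequence[int]]

# Assumption: 8-bit unsigned chroma, where 128 means "no colour".
NEUTRAL_CHROMA = 128


def frame_has_colour(frame: Frame, chroma_threshold: float = 4.0) -> bool:
    """Return True if the mean chroma deviation of the frame exceeds the threshold."""
    _, u, v = frame
    samples = list(u) + list(v)
    if not samples:
        return False
    deviation = sum(abs(s - NEUTRAL_CHROMA) for s in samples) / len(samples)
    return deviation > chroma_threshold


def video_has_colour(frames: Iterable[Frame],
                     chroma_threshold: float = 4.0,
                     min_colour_fraction: float = 0.0) -> bool:
    """Return True if more than min_colour_fraction of the frames contain colour."""
    total = 0
    coloured = 0
    for frame in frames:
        total += 1
        if frame_has_colour(frame, chroma_threshold):
            coloured += 1
    if total == 0:
        return False
    return coloured / total > min_colour_fraction
```

With the default `min_colour_fraction=0.0`, a single coloured frame marks the whole video as "with colour", which is the behaviour that misclassifies a black and white video with coloured credits. Raising the fraction trades that error for the opposite one, so the threshold values here are placeholders that would need tuning on real material.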

This is a bit similar to the "length" discussion from the Prague meeting, where Juergen wanted me to use the data from the CP's database where possible, and only if this is not present use my length derived from the video file (sometimes there are a few seconds of black at the beginning or similar, so I could generate a length of, e.g., 4:34 where the CP's database says 4:30 or similar).

manif_with_sound:
Similar to manif_with_colour. I assume that video files without sound will typically still have an audio track. But I could, e.g., check whether the loudness is below some threshold for the complete video file. Again, similar problems as for manif_with_colour could appear.
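
A loudness check of this kind might look like the sketch below. It assumes the audio track has already been decoded to signed 16-bit PCM samples and uses RMS level in dBFS as a stand-in for "loudness"; the function names and the -60 dB default are hypothetical and would need tuning.

```python
import math
from typing import Sequence


def rms_dbfs(samples: Sequence[int], full_scale: int = 32768) -> float:
    """RMS level of signed 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (full_scale * full_scale))


def has_sound(samples: Sequence[int], silence_threshold_db: float = -60.0) -> bool:
    """Return True if the overall RMS level is above the silence threshold."""
    return rms_dbfs(samples) > silence_threshold_db
```

Measuring over the complete file means a long silent film with one short noisy passage can still fall below the threshold, which is the same kind of disagreement with the CP's database as in the colour case.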

So let me propose the following for manif_with_colour and manif_with_sound (we already treat the length field in the same way):
- I implement the above and add the respective fields in my database.
- Database adaptors fill these fields where possible, also overriding my values with values from the source databases.
- If I find such a field filled, I will never touch it again, so as not to override data inserted by the database adaptors.
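
The precedence rules in this proposal can be illustrated with a small sketch. It models a manifestation record as a plain dictionary where a missing key or `None` means "empty"; the helper names `indexer_fill` and `adaptor_fill` are hypothetical and do not refer to existing GAMA code.

```python
from typing import Any, Dict, Optional

# A record maps field names to values; a missing key or None means "empty".
Record = Dict[str, Optional[Any]]


def indexer_fill(record: Record, field: str, derived_value: Any) -> bool:
    """Write a value derived from the media file only if the field is empty."""
    if record.get(field) is not None:
        return False  # already filled: never touch it again
    record[field] = derived_value
    return True


def adaptor_fill(record: Record, field: str, source_value: Optional[Any]) -> bool:
    """Write a value from the source database, overriding any existing value."""
    if source_value is None:
        return False  # the source database has nothing for this field
    record[field] = source_value
    return True


record: Record = {}
indexer_fill(record, "manif_with_colour", True)   # indexer guesses "with colour"
adaptor_fill(record, "manif_with_colour", False)  # CP database says black and white
indexer_fill(record, "manif_with_colour", True)   # re-indexing leaves the field alone
```

After these three calls the field holds the adaptor's value. Note that a stored `False` counts as filled, since only `None` is treated as empty.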

Original comment by alu...@gmail.com on 14 Jul 2009 at 9:26

GoogleCodeExporter commented 9 years ago
I think this is a good solution, but I would like to STRESS that according to Gaby all that data should be available in their database, and according to Wiel it is in the database: the fields color and sound in the art table. On manifestation level extra info can be found (mono, stereo, etc.), but that is not often in the database, and I don't think there is room for that info in the repository anyway.
I really think that Andree's solution is a back-up solution!

Manif_format is the format of the manifestation that the archives have, as described in the database, not necessarily (only) the one Andree is indexing. At least, that is my understanding from my conversation with Gaby.

Finally, I think the first point you make is interesting, and we should check that those manifestations that are not uploaded also have as much metadata as possible, and are in the database to start with.

Original comment by charles....@kmt.hku.nl on 14 Jul 2009 at 12:18

GoogleCodeExporter commented 9 years ago
I've integrated this as stated above, but manif_with_sound and manif_with_colour should still definitely be overridden where possible by the database adaptors.

Original comment by alu...@gmail.com on 30 Jul 2009 at 8:12

GoogleCodeExporter commented 9 years ago

Original comment by alu...@gmail.com on 17 Aug 2009 at 2:20

GoogleCodeExporter commented 9 years ago

Original comment by alu...@gmail.com on 19 Aug 2009 at 10:35