lfoppiano / grobid-quantities

GROBID extension for identifying and normalizing physical quantities.
https://grobid-quantities.readthedocs.io
Apache License 2.0

Expression of a resolution #43

Open everzeni opened 7 years ago

everzeni commented 7 years ago

I have doubts about this example:

The high spatial resolution of the images (40 mas per pixel, corresponding to ≥ 100 km per pixel) resolve the inner coma, and allow investigations of the dust grain expansion velocities.

I would tend to annotate the ANGLE unit and the LENGTH unit independently of the "per pixel" part (because mas/pixel and km/pixel don't seem to be known units, but I'm not at all sure about that):

- <measure type="value"><num>40</num> <measure type="ANGLE" unit="mas">mas</measure></measure> per pixel

- ≥ <measure type="interval"><num atLeast="100">100</num> <measure type="LENGTH" unit="km">km</measure></measure> per pixel

If we were to annotate mas/pixel and km/pixel as plain units, what would their type be? (Create a RESOLUTION type? UNKNOWN? DENSITY?)

What do you think?

lfoppiano commented 7 years ago

Good question, Emilia.

Your approach seems the most conservative and looks good. I would mark this part and come back to it later. I'm wondering what a pixel is... I would actually tend to annotate pixel as part of the unit and add the type resolution, although this unit looks pretty tricky, as it doesn't seem to have a scientific value or a correspondence in inches/mm.

kermitt2 commented 7 years ago

I would annotate all the units, even the "creative" ones. It's the "generative" aspect of a scientific nomenclature... so annotate pixel as part of the unit, and I would do it now, because it takes a lot of time to come back to cases like that once they are diluted in the training data (from my own experience :( ).

In technical imaging contexts, I have the feeling that pixel is quite common as a unit, for instance bits per pixel (bpp) or pixels per inch (ppi). As Luca says, I think they all correspond to density measures.

everzeni commented 7 years ago

So I annotate:

- <measure type="value"><num>40</num> <measure type="DENSITY" unit="mas/pixel">mas per pixel</measure></measure>

- ≥ <measure type="interval"><num atLeast="100">100</num> <measure type="DENSITY" unit="km/pixel">km per pixel</measure></measure>

?

kermitt2 commented 7 years ago

Yes! But I think I would use the uniform notation "mas.pixel^-1" and "km.pixel^-1" instead of the "/". @lfoppiano what do you think?
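To make the suggested notation concrete, here is a minimal sketch of how a slash-style unit string could be rewritten into the product-of-powers form. The function name `to_product_notation` is hypothetical and is not part of grobid-quantities; the sketch assumes simple units of the form `a/b` or `a/b^n` with integer exponents.

```python
def to_product_notation(unit: str) -> str:
    """Rewrite 'a/b' as 'a.b^-1' and 'a/b^2' as 'a.b^-2'.

    Hypothetical helper for illustration only: it assumes every term
    after a '/' is a single unit with an optional integer exponent.
    """
    numerator, *denominators = unit.split("/")
    parts = [numerator]
    for term in denominators:
        base, _, exp = term.partition("^")
        # A unit in the denominator gets the negated exponent (default 1).
        exponent = -int(exp) if exp else -1
        parts.append(f"{base}^{exponent}")
    return ".".join(parts)


for raw in ("mas/pixel", "km/pixel", "m/s^2", "km"):
    print(raw, "->", to_product_notation(raw))
```

A unit without a "/" is returned unchanged, so already-normalized annotations are left alone.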

lfoppiano commented 7 years ago

@kermitt2 I agree with making the unit annotation uniform.

Regarding the type, while discussing with @everzeni we found that spatial/angular resolution seems a more appropriate type for this kind of measurement involving pixels. What do you think about annotating such measures with the type SPACIAL_RESOLUTION? Alternatively, we could group all the resolutions under a single type: RESOLUTION.

everzeni commented 7 years ago

So I annotated:

The high spatial resolution of the images (<measure type="value"><num>40</num> <measure type="RESOLUTION" unit="mas.pixel^-1">mas per pixel</measure></measure>, corresponding to ≥ <measure type="interval"><num atLeast="100">100</num> <measure type="RESOLUTION" unit="km.pixel^-1">km per pixel</measure></measure>)

@kermitt2 please confirm that we can add the type RESOLUTION.
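As a quick sanity check of the annotation above, the nested `<measure>` elements can be read back with Python's standard `xml.etree.ElementTree`. This is only a sketch: the wrapping `<p>` element and the `extract_measures` helper are assumptions added to make the fragment parseable, not part of the grobid-quantities training format or tooling.

```python
import xml.etree.ElementTree as ET

# The annotated sentence, wrapped in a <p> root (an assumption made here
# so that the fragment is a well-formed XML document).
ANNOTATED = (
    "<p>The high spatial resolution of the images ("
    '<measure type="value"><num>40</num> '
    '<measure type="RESOLUTION" unit="mas.pixel^-1">mas per pixel</measure>'
    "</measure>, corresponding to \u2265 "
    '<measure type="interval"><num atLeast="100">100</num> '
    '<measure type="RESOLUTION" unit="km.pixel^-1">km per pixel</measure>'
    "</measure>)</p>"
)


def extract_measures(xml_text):
    """Return one dict per top-level <measure> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    results = []
    # findall("measure") matches direct children only, i.e. the outer
    # quantity elements, not the nested unit elements.
    for outer in root.findall("measure"):
        num = outer.find("num")
        unit = outer.find("measure")
        results.append({
            "kind": outer.get("type"),
            "value": float(num.text),
            "at_least": num.get("atLeast"),
            "unit_type": unit.get("type"),
            "unit": unit.get("unit"),
            "raw_unit": unit.text.strip(),
        })
    return results


for measure in extract_measures(ANNOTATED):
    print(measure)
```

Both quantities come back with unit type RESOLUTION, and only the second one carries an `atLeast` bound.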

lfoppiano commented 7 years ago

Since SPACIAL RESOLUTION and ANGULAR RESOLUTION are the same, yes, let's add the new type RESOLUTION.