OCR-D / spec

Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)
https://ocr-d.de/en/spec/

QA Spec - Schema #236

Closed kba closed 1 year ago

kba commented 1 year ago

Just the (still evolving) schema, so #225 can focus on the spec itself.

mweidling commented 1 year ago

processing_time as a benchmark for a page appears only in the example files and is not defined in the schema YAML. Since it's a bit tricky to calculate, I also wanted to ask what its overall value is. If it doesn't add much, I recommend dropping it and focusing on pages_per_minute instead.

@kba What do you think?

kba commented 1 year ago

It was an oversight that it's not in the schema. Why is it tricky to calculate, or rather, how would you calculate pages_per_minute without it?

mweidling commented 1 year ago

My first naive approach would be to look at the processing time for the complete work, calculate the mean processing time per page, and use this as the basis for obtaining pages_per_minute. The arithmetic mean per page cannot be used as processing_time, though, as this would imply that it is the actual time the page took.
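
A minimal sketch of that naive approach, assuming the total processing time for the work is available in seconds and the page count is known. The function name and parameters are hypothetical and not part of the schema:

```python
def pages_per_minute(total_processing_seconds: float, page_count: int) -> float:
    """Derive throughput from the processing time of a complete work.

    Hypothetical helper, not defined by the QA spec schema.
    """
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    if total_processing_seconds <= 0:
        raise ValueError("total_processing_seconds must be positive")

    # Arithmetic mean over all pages. This is NOT the actual time any
    # single page took, so it must not be reported as processing_time.
    mean_seconds_per_page = total_processing_seconds / page_count

    return 60.0 / mean_seconds_per_page


# Example: a work of 120 pages processed in 1800 seconds (30 minutes)
print(pages_per_minute(1800, 120))  # 4.0
```

The mean only serves as an intermediate value here; the result says nothing about how long an individual page took.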

kba commented 1 year ago

pages_per_minute is likely the more interesting measure, so feel free to focus on that.

mweidling commented 1 year ago

Let's skip processing_time for pages for now, then. I'll open an issue so that we can implement it at a later stage.

kba commented 1 year ago

Sry, I accidentally rebased instead of merging, but it is all in master now.

mweidling commented 1 year ago

Alright, thank you!

mweidling commented 1 year ago

I'll remove the branch then.