binary-array-ld / net.binary_array_ld.bald

Java library for parsing, validating and managing binary array linked data files, e.g. HDF, netCDF.
Apache License 2.0
0 stars 3 forks source link

Feature/13 coordinate vars #29

Closed simonoakesepimorphics closed 3 years ago

simonoakesepimorphics commented 3 years ago

Fixes #13

marqh commented 3 years ago

hello @simonoakesepimorphics @CMayNorris

please can we explore a little the management of array shape within this PR and its relation to the standard: http://docs.opengeospatial.org/DRAFTS/19-002.html#_netcdf_dimensions http://docs.opengeospatial.org/DRAFTS/19-002.html#_requirement_class_e_variable_to_variable_referencing

the implementation of sourceRefShape targetRefShape and reshape

please note there was an error in the specification, where sourceRefShape was written as sourceShape and targetRefShape was written as targetShape which has only just been corrected.

I don't think there is a bald:sourceReshape predicate but this still appears in chapter 6

the published vocabulary is https://www.opengis.net/def/binary-array-ld this vocabulary is considered correct, the spec document must conform to this published vocabulary

please can we check the terms and the interpretation of these shapes and the expectation on what shall appear within the graph?

given two coordinate variables of size (15) and (10) and a data variable defined as (15, 10) I think we see BALD.reshape (15, 1) and targetShape (15), and BALD.reshape(1, 10) and targetShape (10).

Should this be sourceRefShape (15, 1), targetRefShape (15, 1), and sourceRefShape (10, 1) targetRefShape (1, 10), so that the final broadcast shape can be (15,10).

Or, should this be sourceRefShape (15), targetRefShape (15, 1), and sourceRefShape (10) targetRefShape (1, 10), so that the final broadcast shape can be (15,10).

I fear that the specification is not clear enough on this and Contains actual erros compared to the vocabulary. i can only apologise for this

please may we discuss on this ticket the aim and then raise a follow up ticket to improve the specification document, which is clearly required?

simonoakesepimorphics commented 3 years ago

@marqh I'm not too opinionated about this because I don't fully understand what the source and target arrays represent or how they're intended to be used. I used this example as a guide for the current implementation but I could have misinterpreted it. I've re-read the broadcasting docs and the spec but unfortunately I'm none the wiser.

A concrete example with explanation would be helpful for me. Here is the issue I raised on this subject - 58.

marqh commented 3 years ago

thank you @simonoakesepimorphics

we'll get to work on this and get some updated examples in response to this ticket and the spec ticket

whilst we are waiting on these updates, perhaps we could focus some work On #24 : documentation pages?

simonoakesepimorphics commented 3 years ago

Sure, I can start on the docs next.

marqh commented 3 years ago

we have finally made some tangible progress with respect to the variable referencing issues in the specification. a set of changes have been merged and built into the draft, including:

There is still plenty of space for further change, and feedback would be greatly appreciated on the clarity and implementability of these aspects of the draft. We felt it would be easier if these were included in the draft, rather than presented in a PR markup change.

@simonoakesepimorphics please may you consider the updated draft and provide some initial feedback as a comment here on whether the updated text and requirements provide sufficient clarity on specifying array shape and refShape?

if there are any thoughts on further content for the specification, we would like to collate them on this issue: https://github.com/opengeospatial/netcdf-ld/issues/75

please reach out if you would like a VC session on this many thanks mark

simonoakesepimorphics commented 3 years ago

@marqh Thanks for working on this, the new material has clarified the requirements a lot. I have made some changes to this branch to use the correct property names and I think it satisfies the part of the spec describing coordinate variables in its current state (please correct me if I'm mistaken).

I think what remains to be done to satisfy the spec is supporting the case where the var to var reference is given explicitly, rather than inferred from coordinate vars as in this branch. I would like to limit the scope of this branch and PR to the coordinate variable case and create a separate issue for explicit references - does this sound OK to you? I think that separate issue will also need some further clarifications, given below.

Some feedback & questions:

NetCDF Dimensions

Requirements Class E

marqh commented 3 years ago

hi Simon

many thanks for the feedback, there's more work to be done by us on the spec.

this PR looks good and consistent, so i'll aim to merge this, and I agree with

I think what remains to be done to satisfy the spec is supporting the case where the var to var reference is given explicitly, rather than inferred from coordinate vars as in this branch. I would like to limit the scope of this branch and PR to the coordinate variable case and create a separate issue for explicit references - does this sound OK to you?

We'll get updates to the spec draft to address your feedback comments as soon as we can and update you

mark

marqh commented 3 years ago

hmm, okay, my review is sound, but there is a conflict report on "ModelAliasDefinition.kt"

@simonoakesepimorphics please can you check up on this conflict report?

simonoakesepimorphics commented 3 years ago

Merge conflicts are resolved.

marqh commented 3 years ago

many thanks @simonoakesepimorphics