COMCIFS / MultiBlock_Dictionary

Definitions describing data stored in multiple containers

Minor style questions #8

Open publcif opened 7 months ago

publcif commented 7 months ago

Please forgive me if the following questions seem a bit trivial - just reviewing before the official publication of this dictionary on the IUCr site:

In _description.text, would it be more appropriate to refer to 'data items' rather than 'data names' in phrases such as "The multi block dictionary adds data names to the core dictionary"?

Also, I think 'multiblock' is probably better than 'multi block' (or at least hyphenate it, as 'multi-block', in the above example).

Based on previous filename conventions, would cif_core_multiblock.dic be a bit more fitting than multi_block_core.dic?

jamesrhester commented 7 months ago
  1. Re data items / data names: I think for Volume G we are adopting the terminology in which a 'data name' is the tag itself, a 'data value' is the value associated with a tag, and the pair of name + value is a 'data item', so 'data name' is the correct term for an entry in a dictionary.

  2. Agree that multiblock works better. I will edit in place to fix this if nobody objects.

  3. I like the suggestion of cif_core_multiblock.dic; it will also sort next to cif_core, which can't hurt. Let's see if anyone else has a comment.
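
The terminology in point 1 can be sketched in code. This is a minimal Python illustration only, not taken from the dictionary or from any CIF library: the class `DataItem` and the helper `parse_item` are hypothetical names, `_cell.length_a` is used purely as an example tag, and the parsing assumes a single unquoted value on one line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """A 'data item': the pairing of a data name (tag) with a data value."""
    name: str   # the 'data name', i.e. the tag itself
    value: str  # the 'data value' associated with that tag


def parse_item(line: str) -> DataItem:
    """Split one simple 'tag value' line into a DataItem.

    Assumption: one whitespace-separated, unquoted value per line.
    Real CIF syntax (quoting, loops, multi-line text) is not handled.
    """
    name, value = line.split(maxsplit=1)
    if not name.startswith("_"):
        raise ValueError(f"not a data name: {name!r}")
    return DataItem(name=name, value=value.strip())


item = parse_item("_cell.length_a 5.4310")
print(item.name)   # the data name (tag)
print(item.value)  # the data value
```

Under this reading, a dictionary entry is keyed by the data name, while a data item only exists once a value is attached to that name in a data file.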

vaitkus commented 7 months ago

@jamesrhester

> Re data items / data names: I think for Volume G we are adopting the terminology in which a 'data name' is the tag itself, a 'data value' is the value associated with a tag, and the pair of name + value is a 'data item', so 'data name' is the correct term for an entry in a dictionary.

From what I have seen, the historic usage in IUCr DDL1 dictionaries [1,2] tends to favour the term "data item" in such contexts, that is, "dictionaries define data items", "data items belong to a category", etc. I think this approach was taken because definitions specify not only the data names (tags), but also the constraints and semantics of the values that these tags are associated with (thus forming a complete "data item definition").

In the DDLm version of CIF_CORE both terms are used somewhat interchangeably (which might be a bit confusing, but that is an issue of its own). I must admit I have always favoured "data item" in these contexts, but I do not much mind adopting the new approach as long as we all use the term consistently.

As for "multiblock" vs "multi-block", the only reason I slightly prefer "multi-block" is that it looks a bit better alongside the opposite term "single-block". Related CIF discussions on the web seem to use both "multiblock" and "multi-block", so it would probably be best to go with the term that looks best to a native English speaker (which I am not).

[1] https://github.com/COMCIFS/DDL1-legacy-dictionaries/blob/main/dictionaries/ddl_core.dic
[2] https://github.com/COMCIFS/DDL1-legacy-dictionaries/blob/main/dictionaries/cif_core.dic

publcif commented 7 months ago

Regarding the name of the dictionary, the IUCr are ready to register the DOI as resolving to:

https://www.iucr.org/__data/iucr/cif/dictionaries/cif_core_multiblock_1.0.0.dic

and refer to it as:

Multiblock Core CIF dictionary version 1.0.0

But obviously we need to be consistent with the file name used here (file names can be used in imports for example).

So a decision on this would be welcome (not sure what the convention is when simply requesting a file name change :-)

jamesrhester commented 7 months ago

I've seen no objections so I think we should go with what the IUCr are ready to do. I will create the appropriate pull requests.