Open nautolycus opened 3 months ago
Great work in compiling this table. I agree with all the suggestions.
I do not think DDLs need their own logos. That just draws attention to differences which are for most purposes irrelevant.
I think ddl filenames of the form ddl1.dic and ddlm.dic would work. We can't rename something the wwPDB have custody of - but we could make ddl2.dic resolve to mmcif_ddl.dic in those situations we have control over.
Yes, I agree that we should unify the repository names.
One more question -- should _dictionary.title attribute values be written in all uppercase letters as is being done now? This is mostly a stylistic consideration since _dictionary.title values are case-insensitive.
For example, the current name of the cif_core.dic dictionary is CORE_CIF, so should the name value be changed to cif_core or to CIF_CORE? I think that having it in uppercase makes sense for two reasons:
- _dictionary.title attribute value is also used as the category name of the HEAD category (see description of _name.category_id). Having it in uppercase would be consistent with all the other category names.
- _dictionary.title attribute value is also used as the name of the main dictionary data block (i.e. data_CIF_CORE). Having it in uppercase would be more consistent with the rest of the dictionary (category save frame names are written in all uppercase, data name save frames are written in all-lowercase).
All uppercase works for me.
No objection.
@jamesrhester, I guess the "Change approved" label is appropriate or should this discussion be circulated elsewhere as well?
Raising an issue/pull request in the respective repositories would be sufficient, referencing this issue.
It was agreed in an email exchange between @jamesrhester, @nautolycus and @vaitkus that repositories should be named the same as the dictionary title, except in all lower case, e.g. the CIF_CORE dictionary should reside in a repository called cif_core.
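The agreed rule can be sketched in a few lines of Python. The function name `repository_name` is hypothetical and only illustrates the convention; it is not part of any existing COMCIFS tooling.

```python
def repository_name(dictionary_title: str) -> str:
    """Return the repository name for a dictionary title.

    Follows the convention agreed above: same as the dictionary title,
    but written in all lower case.
    """
    return dictionary_title.lower()


if __name__ == "__main__":
    # The CIF_CORE dictionary would live in a repository called cif_core.
    print(repository_name("CIF_CORE"))
```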
There is actually one more name that would be nice to standardise -- the name of the Head save frame. This save frame is referenced by other dictionaries when importing the entire dictionary, therefore I propose to keep the name of this save frame the same as the data block name. That is, the cif_core.dic dictionary file would have a single CIF_CORE data block which in turn would contain the CIF_CORE save frame (see PR #490 for an example of this layout). The import statement would then look something like:
[{'dupl':Ignore 'file':cif_core.dic 'mode':Full 'save':CIF_CORE}]
Alternatively, we could call the HEAD save frame something neutral like "head", since the dictionary name would already be reflected in import statements by the dictionary filename. The import statement would then look something like:
[{'dupl':Ignore 'file':cif_core.dic 'mode':Full 'save':HEAD}]
Which one looks better?
I think the second one ('HEAD') provides more information to the casual reader, immediately showing that the whole of the imported dictionary is being enhanced.
Having the same name for the data block (dictionary) and the head save frame, while syntactically valid, seems to have already caused an issue with external software (see https://github.com/emmo-repo/CIF-ontology/issues/194), so it would be best to avoid it.
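A dictionary checker could flag this situation with a test along the following lines. This is only a sketch: the function name `names_collide` is hypothetical, and it assumes that data block and save frame names should be compared without regard to case.

```python
def names_collide(data_block_name: str, save_frame_name: str) -> bool:
    """Return True if the two names are equal when case is ignored.

    Assumption: block and save frame names are treated as
    case-insensitive, so CIF_CORE and cif_core count as the same name.
    """
    return data_block_name.casefold() == save_frame_name.casefold()


if __name__ == "__main__":
    # Same name for the data block and the head save frame: flagged.
    print(names_collide("CIF_CORE", "CIF_CORE"))
    # A neutral head save frame name: not flagged.
    print(names_collide("CIF_CORE", "HEAD"))
```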
I fully agree with these points, but after thinking about it some more, I think that we should also avoid having the same HEAD category name in all dictionaries since it might obfuscate certain errors (e.g. those dealing with import statements). Instead, I suggest that the HEAD category names be constructed by appending _HEAD to the dictionary name (e.g. CIF_CORE_HEAD, CIF_PD_HEAD). What do you think?
I think <dictionary_name>_HEAD is an excellent idea. I had also started wondering if it was really a good idea to have the same category name (HEAD) in many dictionaries.
I'm also happy to go with <dictionary_name>_HEAD. In favour of the simple HEAD proposal: it has, to my mind, the effect of establishing a formal starting point, just as '/' indicates the root of a filesystem. Importing to a new HEAD has, in this metaphor, a similar effect to a chroot call. On the other hand, I have in the past tripped up in a chroot'd environment precisely because inspection of the root designator doesn't give you a clue that you might be in an unexpected place.
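For illustration, the <dictionary_name>_HEAD convention and the resulting import statement could be generated as below. This is a rough sketch, not existing tooling: both function names are hypothetical, and the statement layout simply mirrors the import examples quoted earlier in this thread.

```python
def head_category_name(dictionary_title: str) -> str:
    """Build the head category name by appending _HEAD to the title."""
    return f"{dictionary_title.upper()}_HEAD"


def full_import_statement(file_name: str, dictionary_title: str) -> str:
    """Format a full-dictionary import that targets the head save frame.

    The key order and values ('dupl', 'file', 'mode', 'save') copy the
    examples given in this discussion; only the save frame name changes.
    """
    save_frame = head_category_name(dictionary_title)
    return (
        f"[{{'dupl':Ignore 'file':{file_name} "
        f"'mode':Full 'save':{save_frame}}}]"
    )


if __name__ == "__main__":
    print(head_category_name("CIF_PD"))
    print(full_import_statement("cif_core.dic", "CIF_CORE"))
```

Passing the file name explicitly avoids assuming that every dictionary file is named after its title.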
@vaitkus and I were musing over the wide variation in names and references to the various dictionaries, and note the following scatter of designators (G means the version on GitHub, which sometimes corresponds to DDL1/DDLm versions; P refers to the current latest versions on the wwPDB site, over which COMCIFS has no say). We feel it would be beneficial to harmonise these, in ways suggested below the table. I raise this as an issue in the core area, but it relates to all active dictionary projects. If we approve a harmonisation project, I suggest making changes initially to the electron density dictionary, which I'm currently revising and which has no active input from a wider community.
Specific suggestions:
Other comments