COMCIFS / MultiBlock_Dictionary

Definitions describing data stored in multiple containers

Is there a way to explicitly mark that blocks belong to the same multiblock set? #15

Open vaitkus opened 1 month ago

vaitkus commented 1 month ago

If I understand correctly, data blocks are currently considered to belong to the same multiblock set if they appear in the same container (e.g. the same CIF file). However, is there a way to determine that blocks belong to a common set when the container context is not available? For example, if I run something like:

cat multiblock_1.cif multiblock_2.cif | multiblock_data_processor.sh

is there a way for multiblock_data_processor.sh to tell where multiblock_1.cif ends and multiblock_2.cif starts?

jamesrhester commented 1 month ago

No, there is no way. By concatenating them together and providing them to the multiblock processor in that form you are stating that all these data blocks belong together. That is what is meant by the context providing the container.

You can of course check that rows with identical keys have identical values, and may discover contradictory information.
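As a rough illustration of that check, here is a minimal Python sketch. It assumes each data block has already been parsed into a dict mapping a category name to a list of rows; the function name `find_conflicts`, the category `sample`, and the data names `id` and `colour` are all hypothetical and not taken from any dictionary.

```python
from collections import defaultdict


def find_conflicts(blocks, category, key_name):
    """Return {key value: [distinct rows]} for keys whose rows disagree.

    `blocks` is a list of dicts mapping a category name to a list of rows,
    where each row is a dict of data name -> value (an assumed layout).
    """
    seen = defaultdict(list)
    for block in blocks:
        for row in block.get(category, []):
            rows = seen[row[key_name]]
            if row not in rows:  # identical rows are consistent, keep one copy
                rows.append(row)
    # A key is contradictory if more than one distinct row carries it.
    return {key: rows for key, rows in seen.items() if len(rows) > 1}


# Toy data: two blocks that disagree about the row with key "s1".
block_a = {"sample": [{"id": "s1", "colour": "red"}]}
block_b = {"sample": [{"id": "s1", "colour": "blue"},
                      {"id": "s2", "colour": "green"}]}

conflicts = find_conflicts([block_a, block_b], "sample", "id")
```

A non-empty result suggests the concatenated blocks may not describe a single consistent data set, although an empty result does not prove that they belong together.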

vaitkus commented 1 month ago

That is a bit unfortunate, since this could break some of our workflows. Are there any plans to add optional identifiers that would make it possible to specify explicitly which blocks belong to the same data set?

jamesrhester commented 1 month ago

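There are no plans as such, due to the discussion here (https://www.iucr.org/__data/iucr/lists/ddlm-group/msg01626.html), and some earlier ideas are posted here (https://www.iucr.org/__data/iucr/lists/comcifs-l/msg00677.html). There may be something useful there?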

rowlesmr commented 3 weeks ago

See also https://github.com/COMCIFS/comcifs.github.io/blob/main/draft/block_collections.md

I like the _audit_dataset.id idea. Give each block an identifier that says which dataset it belongs to. Collate blocks that have the same id.

The hybrid approach probably works better for legacy issues. Consumers of CIFs can ignore it if they want.
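A minimal Python sketch of the collating step, assuming blocks have already been parsed into (block name, dict of data name to value) pairs. `_audit_dataset.id` is only the idea named above, not an adopted data name, and the function name and block names are made up for illustration.

```python
from collections import defaultdict

# Proposed in this thread only; not an adopted data name.
DATASET_ID = "_audit_dataset.id"


def collate_by_dataset(blocks):
    """Group block names by their dataset identifier.

    `blocks` is an iterable of (block name, items) pairs, where `items`
    maps data names to values (an assumed layout). Blocks without the
    identifier are collected under the key None.
    """
    groups = defaultdict(list)
    for name, items in blocks:
        groups[items.get(DATASET_ID)].append(name)
    return dict(groups)


# Toy input, as if two files had been concatenated before processing.
blocks = [
    ("block_1", {DATASET_ID: "set_A"}),
    ("block_2", {DATASET_ID: "set_A"}),
    ("block_3", {DATASET_ID: "set_B"}),
    ("block_4", {}),  # legacy block with no identifier
]

groups = collate_by_dataset(blocks)
```

Blocks under `None` carry no identifier, so a consumer would have to decide what to do with them, for example by falling back to the container context.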

On Wed, 24 Jul 2024 at 10:03, James Hester @.***> wrote:

There are no plans as such due to the discussion here https://www.iucr.org/__data/iucr/lists/ddlm-group/msg01626.html and some earlier ideas are posted here https://www.iucr.org/__data/iucr/lists/comcifs-l/msg00677.html. There may be something useful there?
