mgaonach opened 2 years ago
I suppose that where it makes sense, we would like to map these values onto existing tables and columns, i.e. re-use what we already have in the schema.
But we shouldn't shy away from creating new tables and columns either, if that makes more sense.
The "Experiment/BL parameters" values could possibly go into existing and new columns in `DataCollectionGroup` and/or `BeamLineSetup`?
Many of the "Data collection" values map onto existing columns in the `DataCollection` table with the same/similar names. The exceptions are maybe Repetition rate and Energy_bandwidth?
Sample/crystal slurry: I wonder if maybe a new table is needed for this? I suppose the Protein value should just be a proteinId FK pointing to a row in the `Protein` table. What about Buffer? Is that also a pointer to a row in `Protein`? Or to `BLSample` or something else?
"Crystfel_processing", "PyFAI_parameters" and "Crystfel_merging" all sound like different steps in a processing pipeline. At Diamond, we would perhaps consider putting those values into the tables `ProcessingJob`, `ProcessingJobParameter` (a key-value table), `AutoProcProgram` and `AutoProcProgramAttachment`.
(My understanding of SSX is not great so far, so I'll have to come back to this later when I know more ...)
The idea we had was to use existing tables as much as possible where they match, and then extend them with new tables dedicated to SSX where extra columns are required. The goal is to avoid adding even more columns to some already big and not very well documented tables, which would make them harder to understand and maintain.
Example: Use `DataCollection` for all the columns that are already there, and create `SSXDataCollection` with the new SSX-dedicated columns and a PK referencing the `DataCollection` tuple for the other columns.
I think we will indeed need a new `SSXSample` table, with FKs to `Protein` and to a new `SSXBuffer` table (we still need to define what will need to be stored for this buffer entity).
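The extension-table idea above can be sketched as follows: `DataCollection` keeps the existing generic columns, while `SSXDataCollection` holds only the SSX-specific ones, its primary key doubling as a foreign key to `DataCollection`. This is only an illustration (SQLite here, and invented column sets; the real ISPyB schema is MariaDB):

```python
import sqlite3

# Minimal sketch of the extension-table pattern: the extension's PK
# references the base table, so an SSX row cannot exist without its
# generic counterpart. Column names are illustrative, not real DDL.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("""
    CREATE TABLE DataCollection (
        dataCollectionId INTEGER PRIMARY KEY,
        exposureTime REAL,
        transmission REAL
    )""")
con.execute("""
    CREATE TABLE SSXDataCollection (
        dataCollectionId INTEGER PRIMARY KEY
            REFERENCES DataCollection(dataCollectionId),
        repetitionRate REAL,
        energyBandwidth REAL
    )""")

con.execute("INSERT INTO DataCollection VALUES (1, 0.01, 100.0)")
con.execute("INSERT INTO SSXDataCollection VALUES (1, 200.0, 0.3)")

# A simple join reconstructs the full tuple for an SSX collection.
row = con.execute("""
    SELECT dc.exposureTime, ssx.repetitionRate
    FROM DataCollection dc
    JOIN SSXDataCollection ssx USING (dataCollectionId)
""").fetchone()
print(row)  # (0.01, 200.0)
```

The shared primary key keeps the relationship strictly 1-to-1 at the database level, rather than relying on application code.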
At first, we will focus on the experiment, sample and data collection part of the API, and in a second step come back to the processing API.
Anyways, this is a first and very early stage version of the requirements. A lot of elements still need to be cleared up with the scientists, and we also need to get a better understanding of SSX with their help!
I think we should probably try to discuss this in a dev meeting before development starts, did the document come from ESRF internal meetings?
In previous ISPyB meetings we had discussed using `DataCollection` with linker tables for discipline-specific new parameters. Most of what you want is provided by `DataCollection`, so we should try to use this as the entry point to avoid duplicating; this also means you don't need to create entirely new resources for SSX.
There are certainly questions about sample definition and linkages there.
I remember quite a well-developed prototype data model for ISPyB SSX tables - has that work now been abandoned?
We have `monoBandwidth` in `DiffractionPlan`; we could add this to `DataCollection`, as it has the same meaning as `energyBandwidth`.
Looks like experiment / BL parameters should be merged into `DataCollection` too. `DataCollection` holds the experiment parameters for this particular data collection.
Looking more closely, almost everything already exists...
Experiment_type (chip/injector) -> `DataCollectionGroup.experimentType` -> Extend enum with `SSX Chip`, `SSX Injector` (or better yet use the new `ExperimentType` table from @KarlLevik and `DataCollectionGroup.experimentTypeId`)
Sample support -> Should be in sample definition? Unsure
Jet_material -> Should be in sample definition? Unsure
Mono_stripe -> `DataCollection`. Not sure what this is technically? Stripes are usually related to focusing mirrors, what mono is this?
Beam_Size -> `DataCollection.beamSizeAtSampleX`, `DataCollection.beamSizeAtSampleY`
Temperature -> DataCollection.averageTemperature
Exp.time -> DataCollection.exposureTime
Transmission -> DataCollection.transmission
Flux -> DataCollection.flux
Repetition rate -> New?
Energy_bandwidth -> `DataCollection.monoBandWidth` (need to copy from `DiffractionPlan`)
Beam_center_x -> DataCollection.xBeam
Beam_center_y -> DataCollection.yBeam
Det_distance -> `DataCollection.detectorDistance` (and ultimately `DataCollection.resolution`)
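For reference, the mapping above can be condensed into a simple lookup table. This is just an abbreviated sketch (names copied from the list; `None` marks values that still need a home):

```python
# SSX requirement name -> existing ISPyB column it would map onto.
# None means no obvious existing home yet (a new column is needed).
SSX_TO_ISPYB = {
    "Experiment_type": "DataCollectionGroup.experimentType",
    "Beam_Size": ("DataCollection.beamSizeAtSampleX",
                  "DataCollection.beamSizeAtSampleY"),
    "Temperature": "DataCollection.averageTemperature",
    "Exp.time": "DataCollection.exposureTime",
    "Transmission": "DataCollection.transmission",
    "Flux": "DataCollection.flux",
    "Repetition rate": None,  # new column needed
    "Energy_bandwidth": "DataCollection.monoBandWidth",  # copy from DiffractionPlan
    "Beam_center_x": "DataCollection.xBeam",
    "Beam_center_y": "DataCollection.yBeam",
    "Det_distance": "DataCollection.detectorDistance",
}

# Which parameters still lack a destination column?
missing = [name for name, col in SSX_TO_ISPYB.items() if col is None]
print(missing)  # ['Repetition rate']
```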
When it comes to sample definition, I think we should probably avoid linkages to SAXS tables (as these could be refactored, long in the future)
`Protein` / Buffer -> Depends on complexity of definition. We have Screen, ScreenComponent (linked to Protein) to define multi-component (protein) buffers for crystallisation. Protein can represent small molecules, dna, and rna, as well as "proteins"
Avg_xtal_size -> Some of these things fit into Crystal
Xtal_concentration -> Concentration of crystals within the stream? What units?
Then potentially `DataCollection_has_SSX_Sample` as a linker, or `DataCollection.ssxSample`?
What this ultimately means is that you don't need to create an SSX data collection endpoint, you can use the existing multi-discipline `/events` resource I created.
I'll come back to processing. CrystFEL processing is basically integration, so it should map onto `AutoProc`/Integration with some modifications. Plots can be added to `AutoProcProgramAttachment`. Not sure where to store the `PyFAI` params; as @KarlLevik says, it would be possible to use `ProcessingJobParameter` for this.
Here is the interpretation I had of what we could do, excluding the processing section. We will have an internal meeting with the scientists on Monday to clear up the requirements.
Looks good, I think you can probably merge `SSXExperiment` and `SSXDataCollection`.
`energyBandwidth` could go into `DataCollection` as `monoBandwidth`, as other beamlines also make use of pink and white beam data collections. E.g. id01, which I am about to deploy daiquiri on, so it could be useful for non-MX too.
> Looks good, i think you can probably merge `SSXExperiment` and `SSXDataCollection`.
Yes, sounds like a good idea to simplify things!
I think we can probably generalise `SSXBuffer` as well using `Crystal`, `BLSampleTypeHasComponent`, and `Protein`.
Here's how we can handle creating complex samples with the core tables (without needing to link to the BioSAXS tables)
Some other thoughts:
Historically some developments have created new tables / resources per discipline. This makes it hard to share features. Instead we should try to create tables that support features. MX and other techniques can also benefit from ligands and buffers (or generic ways to describe chemical and other compositions).
We should avoid creating yet another data collections resource. This should be handled by the `events` route; we don't want to create a new dc resource for every discipline as it adds large chunks of duplicated code (every data collection list will want to filter by sampleid, time, beamline, etc.). We can join the additional tables to the query, so that a single data collection resource can serve multiple disciplines. It could also serve EM with minor modifications (and does at DLS).
Try to keep to naming standards: `SSXSample`, not specimen (specimen is a BioSAXS deviation)
We don't need all the sample information with the dc; this information can be queried separately. Split queries and add a link in the UI from the dc page to a sample page. We traditionally just return the sample name and protein acronym with the dc.
Now we only have 3 missing dc columns and 1 missing sample column; do we need new tables? `energyBandwidth` and `monoStripe` are useful outside of SSX and should probably be added directly to `DataCollection`.
SSX/DataCollection:
- `repetitionRate` float DEFAULT NULL
- `energyBandwidth` float DEFAULT NULL
- `monoStripe` varchar(255) DEFAULT NULL

SSX/BLSample:
- `jetMaterial` varchar(255) DEFAULT NULL
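As a minimal sketch, these additions would amount to four `ALTER TABLE ... ADD COLUMN` statements. The demonstration below uses SQLite with stub tables standing in for the real ones (the production DDL would target MariaDB, where the same `ADD COLUMN` form applies):

```python
import sqlite3

# Stub tables standing in for the real DataCollection / BLSample tables,
# just to show the proposed column additions applying cleanly.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE DataCollection (dataCollectionId INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE BLSample (blSampleId INTEGER PRIMARY KEY)")

for ddl in [
    "ALTER TABLE DataCollection ADD COLUMN repetitionRate float DEFAULT NULL",
    "ALTER TABLE DataCollection ADD COLUMN energyBandwidth float DEFAULT NULL",
    "ALTER TABLE DataCollection ADD COLUMN monoStripe varchar(255) DEFAULT NULL",
    "ALTER TABLE BLSample ADD COLUMN jetMaterial varchar(255) DEFAULT NULL",
]:
    con.execute(ddl)

# Confirm the columns landed where expected.
cols = [r[1] for r in con.execute("PRAGMA table_info(DataCollection)")]
print(cols)  # ['dataCollectionId', 'repetitionRate', 'energyBandwidth', 'monoStripe']
```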
Outdated - jump to https://github.com/ispyb/py-ispyb/issues/192#issuecomment-1219384580
Here is the new iteration of the schema we intend to use for handling SSX data collections and samples.
Existing tables are on the left side and new tables on the right side.
The idea here is to have something generic, without trying to force everything into the existing and not always appropriate tables (i.e. the `Protein`, `BlSampleType_has_Component`, etc. tables).
New features include:
- `Sequence`, `SequenceEvent`, `SequenceEventType`: allows saving timed events that happened during the experiment (laser excitation, X-ray exposure, image acquisition, component mixed into the sample...)
- `Component`, `CrystalComposition`, `SampleComposition`: allows describing what composes the crystal (made prior to the experiment) and the sample that was analyzed (containing the crystal and other new components added as preparation for the experiment, or during the experiment to trigger reactions).
- `SSXDataCollection`: extension of `DataCollection` with new parameters specific to SSX.

Here is how an SSX data collection would fit in this schema:
Each `DataCollection` entity belongs to a `DataCollectionGroup`. In the use case of SSX, 1 `DataCollectionGroup` = 1 `DataCollection` = 1 `SSXDataCollection` (`SSXDataCollection` is just an extension of `DataCollection` with SSX-specific fields).
This `DataCollectionGroup` is linked to the `BLSample` that was analyzed. This `BLSample` is created and used only for this `DataCollectionGroup`. It is made of:
- a protein, as described by the `Protein` table.
- a list of `Component` to specify what other chemicals are in the crystal composition (buffer, ligands...). This list is defined through the `CrystalComposition` linker table, which describes which component is present at what concentration (= `abundance`). This concentration can be expressed in different units (M, %(w/v), %(v/v), mg/ml) for different components, so we are using the `ConcentrationType` lookup table as an enum to specify which unit is used (there are discussions about moving the link to `ConcentrationType` into `Component`, since the same component's concentration should always be expressed in the same unit).
- `Component`s that have been added to the crystal specifically for this data collection, defined through the linker table `SampleComposition`, on the same principle as `CrystalComposition`.

Let's say we added a new ligand during the data collection to see what happened. Then we need to know when that new ligand was mixed with the sample, so that we can figure out how long it has been in contact with the protein when each image was taken. For that we use the `Sequence`: we can create a sequence named `Sample preparation` and add a `SequenceEvent` at the time of the mixing with `SequenceEventType` = `ReactionTrigger`.
This sequence could also be used for laser excitation before image acquisition but this use case is not well defined yet so should be discussed later.
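The walk described above, from a data collection to everything that was actually in the beam, can be sketched as a query. This is purely illustrative (invented, simplified column sets; the real schema differs):

```python
import sqlite3

# Simplified sketch of the proposed schema: DataCollection ->
# DataCollectionGroup -> BLSample -> Crystal, plus the two composition
# linker tables. Not the real ISPyB DDL.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Component (componentId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Crystal (crystalId INTEGER PRIMARY KEY);
    CREATE TABLE CrystalComposition (
        crystalId INTEGER, componentId INTEGER, abundance REAL);
    CREATE TABLE BLSample (blSampleId INTEGER PRIMARY KEY, crystalId INTEGER);
    CREATE TABLE SampleComposition (
        blSampleId INTEGER, componentId INTEGER, abundance REAL);
    CREATE TABLE DataCollectionGroup (
        dataCollectionGroupId INTEGER PRIMARY KEY, blSampleId INTEGER);
    CREATE TABLE DataCollection (
        dataCollectionId INTEGER PRIMARY KEY, dataCollectionGroupId INTEGER);

    INSERT INTO Component VALUES (1, 'HEPES'), (2, 'mysecretligand');
    INSERT INTO Crystal VALUES (10);
    INSERT INTO CrystalComposition VALUES (10, 1, 0.5);  -- in the crystal slurry
    INSERT INTO BLSample VALUES (20, 10);
    INSERT INTO SampleComposition VALUES (20, 2, 0.1);   -- added for this collection
    INSERT INTO DataCollectionGroup VALUES (30, 20);
    INSERT INTO DataCollection VALUES (40, 30);
""")

# Everything in the beam = crystal composition UNION sample composition.
names = [r[0] for r in con.execute("""
    SELECT c.name FROM DataCollection dc
    JOIN DataCollectionGroup dcg USING (dataCollectionGroupId)
    JOIN BLSample s USING (blSampleId)
    JOIN CrystalComposition cc USING (crystalId)
    JOIN Component c USING (componentId)
    WHERE dc.dataCollectionId = 40
    UNION
    SELECT c.name FROM DataCollection dc
    JOIN DataCollectionGroup dcg USING (dataCollectionGroupId)
    JOIN SampleComposition sc USING (blSampleId)
    JOIN Component c USING (componentId)
    WHERE dc.dataCollectionId = 40
""")]
print(sorted(names))  # ['HEPES', 'mysecretligand']
```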
SampleEvent seems to be per-image; does that mean (potentially) millions of rows per data session?
I was surprised for similar reasons that `Image` is an existing table; although I note that we apparently don't use it - I can only see 137 rows on our internal instance.
Is `Sequence` the best choice of table name, when "sequence" tends to have a very specific meaning in structural biology?
Are all of these 1-to-many relationships actually 1-to-many, or are some of them intended to be 1-to-1 in practice? E.g. if `BLSample` -> `SampleComposition` is 1-to-many, I don't see a link from data collection to a row in `SampleComposition`.
> Are all of these 1-to-many relationships actually 1-to-many, or are some of them intended to be 1-to-1 in practice? E.g. if `BLSample` -> `SampleComposition` is 1-to-many, I don't see a link from data collection to a row in `SampleComposition`.
Both composition tables are 1-to-many
`Crystal` -> `CrystalComposition` -> `Component` allows: HEPES (Buffer, 0.5 M, pH 7.5), NaCl (Salt, 0.75 M), i.e. what was in my sample before I did my experiment.
Then during an experiment you maybe add one or more ligand, additive, etc.: `BLSample` -> `SampleComposition` -> `Component` allows: mysecretligand (Ligand, 0.1 M), Glycerol (Additive, 10%)
You're right though, maybe a link here to `DataCollection.dataCollectionId` would be sensible for context
> SampleEvent seems to be per-image; does that mean (potentially) millions of rows per data session?
>
> I was surprised for similar reasons that `Image` is an existing table; although I note that we apparently don't use it - I can only see 137 rows on our internal instance.
Yes, millions of rows for each session would be problematic. It is still a bit unclear what will be required on the processing side, and whether we need to have a record for each individual image or not. This schema also allows saving a `period` and repetition, which allows saving all image acquisitions in a single record if they are acquired at a specific frequency - without using the `image` table. So hopefully we can make it fit in there when the use cases become more clear.
> Are all of these 1-to-many relationships actually 1-to-many, or are some of them intended to be 1-to-1 in practice? E.g. if `BLSample` -> `SampleComposition` is 1-to-many, I don't see a link from data collection to a row in `SampleComposition`.
>
> Both composition tables are 1-to-many. `Crystal` -> `CrystalComposition` -> `Component` allows: HEPES (Buffer, 0.5 M, pH 7.5), NaCl (Salt, 0.75 M), i.e. what was in my sample before I did my experiment. Then during an experiment you maybe add one or more ligand, additive, etc.: `BLSample` -> `SampleComposition` -> `Component` allows: mysecretligand (Ligand, 0.1 M), Glycerol (Additive, 10%). You're right though, maybe a link here to `DataCollection.dataCollectionId` would be sensible for context.
Thanks, I see - so would a single `BLSample` correspond to e.g. a single fixed-target chip, but it can change during the course of an experiment?
> Are all of these 1-to-many relationships actually 1-to-many, or are some of them intended to be 1-to-1 in practice? E.g. if `BLSample` -> `SampleComposition` is 1-to-many, I don't see a link from data collection to a row in `SampleComposition`.
>
> Both composition tables are 1-to-many. `Crystal` -> `CrystalComposition` -> `Component` allows: HEPES (Buffer, 0.5 M, pH 7.5), NaCl (Salt, 0.75 M), i.e. what was in my sample before I did my experiment. Then during an experiment you maybe add one or more ligand, additive, etc.: `BLSample` -> `SampleComposition` -> `Component` allows: mysecretligand (Ligand, 0.1 M), Glycerol (Additive, 10%). You're right though, maybe a link here to `DataCollection.dataCollectionId` would be sensible for context.
>
> Thanks, I see - so would a single `BLSample` correspond to e.g. a single fixed-target chip, but it can change during the course of an experiment?

From my vague understanding (@mgaonach knows more) this is more specifically for jet based mixing, rather than chips
I think the comment was that if there isn't a link to/from SampleComposition then it doesn't make sense - the only way to walk to the SampleComposition would be the one through DataCollectionGroup/Sample (in which case there is no meta-information to tell which one you used). So, a link from SSXCollection to sample composition would seem to make sense.
Additional queries:
> This schema also allows saving a `period` and repetition, which allows saving all image acquisitions in a single record if they are acquired at a specific frequency - without using the `image` table. So hopefully we can make it fit in there when the use cases become more clear.
I'm not sure this works with the table as-is - it has a timestamp field, which implies a global fixed point in time (to match an image). I could imagine it being more useful as e.g. a "Plan" - I'm not sure how often it's a simple single pulse with delay and not a structure of multiple pulses (which could be a reason for multiple SequenceEvents per collection).
> we can make it fit in there when the use cases become more clear.
I'm not sure how much it's worth planning fixed-in-stone data structures if we don't know what they are supposed to be used for or what the use cases are. That seems like it will inevitably end up with something complicated to implement because it's trying to be generic, but not generic enough to be useful.
> I think the comment was that if there isn't a link to/from SampleComposition then it doesn't make sense - the only way to walk to the SampleComposition would be the one through DataCollectionGroup/Sample (in which case there is no meta-information to tell which one you used). So, a link from SSXCollection to sample composition would seem to make sense.
I am not entirely sure I understand, but I think it comes down to something unclear in the schema: the relation between `DataCollectionGroup` and `BLSample` is intended to be 1-to-1 in practice.
A `DataCollection` is part of a `DataCollectionGroup` (in principle only one `DataCollection` per `DataCollectionGroup` in the case of SSX). This `DataCollectionGroup` is associated with a `BLSample`, which is composed of a list of many `Component`s. So from a `DataCollection` you can determine the list of `Component`s that were actually in what has been analyzed. Hope this clears things up.
> I'm not sure this works with the table as-is - it has a timestamp field, which implies a global fixed point in time (to match an image).
In the case of a single event (for instance a ligand being mixed), the timestamp is the time at which this event happened. In the case of a repeated event, the timestamp is the first iteration and we can get the others with `period` and `repetition`. Maybe we can make this clearer by splitting it into `SingleSequenceEvent` and `RepeatedSequenceEvent` tables.
> I'm not sure how much it's worth planning fixed-in-stone data structures if we don't know what they are supposed to be used for or what the use cases are.
This is a prototype and in no way fixed in stone. As of now we have a good idea of what needs to be saved, and we're iterating on solutions as it gets refined. This can - and definitely will - be modified before ending up in production.
> I am not entirely sure I understand, but I think it comes down to something unclear in the schema: the relation between `DataCollectionGroup` and `BLSample` is intended to be 1-to-1 in practice. A `DataCollection` is part of a `DataCollectionGroup` (in principle only one `DataCollection` per `DataCollectionGroup` in the case of SSX). This `DataCollectionGroup` is associated with a `BLSample`, which is composed of a list of many `Component`s. So from a `DataCollection` you can determine the list of `Component`s that were actually in what has been analyzed. Hope this clears things up.
The relationship from `DCG` to `BLSample` is ~1:1, but often the same ~sample~ ~crystal~ "literal thing you put in front of the beam" is shot several times (in separate groups; `DCG`:`DataCollection` are mostly 1:1 for us). I'm not sure this makes sense for serial. What I was considering for internal implementation with the current schema was: a `BLSample` == a single fixed-target chip, autogenerated for them (instead of having to manually register every single chip). The limiting step for us on that is that there isn't a good way for users to physically enter the crystal information alone (without going through the whole container/dewar workflow).
> is associated to a BLSample, which is composed of a list of many Component
Now, here, a `BLSample` isn't composed of a list of many `Component` - it's composed of many `SampleComposition`, which are themselves composed of many `Component`, but also of a single (named) `ConcentrationType`, which is composed of many `CrystalComposition`, each of which can also be a component.
If `BLSample` is supposed to be 1:1 "in practice" to data collections (which, I think I agree with, and was how I thought about the problem), then what is the meaning of multiple `SampleComposition` for each `BLSample`? If a "Composition" is mixed with both crystalline and non-crystalline parts, then what is `ConcentrationType` for - which seems to be a mixture of only crystals?
Is the idea that a "`Component`" is essentially something mixed into a slurry (a `SampleComposition`), that can contain crystals (which are also themselves a `Component`) but also non-crystalline things that are also mixed into the slurry (hence linked to `SampleComposition` directly)?
> In the case of a repeated event then the timestamp is the first iteration and we can get the others with period and repetition.
"first iteration" is still per-frame, right? A single image might have several pulses of the laser, or a light pulse and then a defined waiting time and then a substrate injection. Or was the idea that the timestamp is the start of the collection and the delay is set the same as `1 / fps` (in which case, how do you handle varying gap times between shots as e.g. the goniometer moves to a different part of the chip)?
> This is a prototype and in no way fixed in stone. As of now we have a good idea of what needs to be saved and we're iterating solutions as it gets refined. This can - and definitely will - be modified before ending up in production.
Okay - so, what led it to be in this shape; what problem is it solving, or what experiment has it been added for? What is it trying to represent?
Maybe this is clear to everyone else, but for me to understand, it might help to have some specific "story" examples of "a serial experiment" and specifically what constraints those have imposed on the complexity here. That might invert the problem from "What is the purpose of this table and its constraints?" to "This complexity is required to solve problem X".
Outdated - jump to https://github.com/ispyb/py-ispyb/issues/192#issuecomment-1269629663
Here is a corrected version of the schema. I fixed the 1-to-many relations that were actually 1-to-1 and removed the misleading link to Image (this part was a very rough idea waiting for further requirements; the idea behind this prototype was focused on the sample part). As before, existing tables are on the left side and new tables on the right side.
Here is how an SSX data collection would fit in this schema:
Each `DataCollection` entity belongs to a `DataCollectionGroup`. In the use case of SSX, 1 `DataCollectionGroup` = 1 `DataCollection` = 1 `SSXDataCollection` (`SSXDataCollection` is just an extension of `DataCollection` with SSX-specific fields).
This `DataCollectionGroup` is linked to the `BLSample` that was analyzed. This `BLSample` is created and used only for this `DataCollectionGroup`. It is made of:
- a protein, as described by the `Protein` table.
- a list of `Component` to specify what other chemicals are in the crystal composition (buffer, ligands...). This list is defined through the `CrystalComposition` linker table, which describes which component is present at what concentration (= `abundance`). This concentration can be expressed in different units (M, %(w/v), %(v/v), mg/ml) for different components, so we are using the `ConcentrationType` lookup table as an enum to specify which unit is used.
- `Component`s that have been added to the crystal specifically for this data collection, defined through the linker table `SampleComposition`, on the same principle as `CrystalComposition`.

Let's say we added a new ligand during the data collection to see what happened. Then we need to know when that new ligand was mixed with the sample, so that we can figure out how long it has been in contact with the protein when each image was taken. For that we use the `Sequence`: we can create a sequence named `Sample preparation` and add a `SequenceEvent` at the time of the mixing with `SequenceEventType` = `ReactionTrigger`.
This sequence is also used to describe data acquisition steps with a set of `Xray exposure`, `Xray detection` and `laser excitation` events with a specific period and number of repetitions.
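My reading of the period/repetition idea, as a sketch (not actual py-ispyb code): a single stored event row with a start time, a `period` and a `repetition` count stands in for many individual occurrences, avoiding a per-image row, and can be expanded on demand:

```python
from datetime import datetime, timedelta

def expand_event(start: datetime, period_s: float, repetition: int):
    """Return the absolute time of each occurrence of a repeated event.

    One stored row (start, period, repetition) replaces `repetition`
    individual event rows.
    """
    return [start + timedelta(seconds=i * period_s) for i in range(repetition)]

# E.g. 4 acquisitions at 10 Hz starting at noon (illustrative values).
times = expand_event(datetime(2022, 8, 1, 12, 0, 0), period_s=0.1, repetition=4)
print([t.strftime("%H:%M:%S.%f")[:-3] for t in times])
# ['12:00:00.000', '12:00:00.100', '12:00:00.200', '12:00:00.300']
```

This only covers a strictly periodic pattern; irregular gaps (e.g. the goniometer moving between chip regions, as raised above) would still need either multiple rows or extra fields.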
Updated previous comment with latest data model. For UI, please refer to https://github.com/ispyb/py-ispyb-ui/issues/15.
If the relationship between `DataCollection` and `SAXSDataCollection` is truly 1-to-1, then the PK of `SAXSDataCollection` should be `dataCollectionId`, and there is no need for `saxsDataCollectionId`.
The same goes for the relationship between `DataCollection` and `SSXHits` - the PK of `SSXHits` should be `dataCollectionId`.
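A practical benefit of this suggestion: if the extension table's primary key is `dataCollectionId` itself, the database enforces the 1-to-1 relationship on its own. A small sketch (illustrative column set, not the real DDL):

```python
import sqlite3

# With the PK doubling as the FK, a second extension row for the same
# DataCollection is rejected by the database itself.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE DataCollection (dataCollectionId INTEGER PRIMARY KEY);
    CREATE TABLE SSXHits (
        dataCollectionId INTEGER PRIMARY KEY
            REFERENCES DataCollection(dataCollectionId),
        nbHits INTEGER
    );
    INSERT INTO DataCollection VALUES (1);
    INSERT INTO SSXHits VALUES (1, 4200);
""")

try:
    con.execute("INSERT INTO SSXHits VALUES (1, 9999)")  # duplicate PK
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)  # False
```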
And yes, please consider using `DataCollectionGroup.experimentTypeId` -> the `ExperimentType` table, which I think Stu mentioned. There are several advantages to using a lookup table instead of this enum, since it has so many enum options now. Although perhaps that should be its own ticket, since it will have impact beyond SSX.
I'm a little bit concerned about the potential size of the `GraphData` table. How many such rows do we expect per `Graph`? An alternative could be to put the xy data in files and reference them through the `DataCollectionFileAttachment` table (fileType = 'xy'). That said, the advantage of storing the data in the database might outweigh the inconvenience of dealing with a potentially big table.
Re: the naming of `Sequence`, `SequenceEvent`, `SequenceEventType`: if we want shorter names, an alternative could be to remove the "Sequence" prefix in the two latter tables. You could also rename `Sequence` to `EventChain` if you think that makes the relationship clearer.
Re: the name of `SSXHits`: the number of hits is one value stored in the table, but there are other values as well. So perhaps we should have a name that is a bit more generic? A name hinting at the fact that the table contains SSX processing results?
> If the relationship between `DataCollection` and `SAXSDataCollection` is truly 1-to-1, then the PK of `SAXSDataCollection` should be `dataCollectionId`, and there is no need for `saxsDataCollectionId`.
That's a very good point! I will update the model with that.
> please consider using `DataCollectionGroup.experimentTypeId` -> `ExperimentType` table
I think this table is not in the schema we have. But I agree this enum is not very convenient. We should look into it.
> I'm a little bit concerned about the potential size of the GraphData table. How many such rows do we expect per Graph? An alternative could be to put the xy data in files and reference them through the DataCollectionFileAttachment table (fileType = 'xy'). That said, the advantage of storing the data in the database might outweigh the inconvenience of dealing with a potentially big table.
Yes, I was hesitating between both of these options. It is currently used for the unit cell parameter frequency graphs (which are divided into 100 bins = 100 rows per graph - I think we could reduce that number of bins), and we have 6 parameters, so 6 graphs = 600 rows per data collection. I went with database storage to see how it works, but if you have any experience/input on which would be best, it would be very welcome.
> Re: the naming of Sequence, SequenceEvent, SequenceEventType: If we want shorter names, an alternative could be to remove the "Sequence" prefix in the two latter tables. You could also rename Sequence to EventChain if you think that makes the relationship clearer.
Sounds good to me. Getting rid of `Sequence` would also be good to avoid any confusion with the biological meaning.
> Re: the name of SSXHits: The number of hits is one value stored in the table, but there are other values as well. So, perhaps we should have a name that is a bit more generic? A name hinting to the fact that the table contains SSX processing results?
Yes, I was thinking of renaming it to `SSXDataCollectionProcessing` and adding another `SSXExperimentProcessing` for the experiment-wide results. If we move to file-based graph data for the unit cell parameters, the path could also be a column of `SSXDataCollectionProcessing`.
Thank you very much for your feedback!
I was also thinking `SSXHits.latticeType` should be expanded out into its components `cell_a, b, c, alpha, beta, gamma`. Storing this as a string makes mining hard. There might already be a table for unit cell parameters; I'll have to check.
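A sketch of this suggestion (illustrative table; the `cell_*` column names follow the comment): with six numeric columns instead of one string, range queries over the unit cell become trivial, which is exactly the "mining" a string blocks:

```python
import sqlite3

# Illustrative SSXHits variant with the unit cell stored as six numeric
# columns rather than a single latticeType string.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE SSXHits (
        dataCollectionId INTEGER PRIMARY KEY,
        cell_a REAL, cell_b REAL, cell_c REAL,
        cell_alpha REAL, cell_beta REAL, cell_gamma REAL
    )""")
con.executemany(
    "INSERT INTO SSXHits VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, 78.9, 78.9, 38.1, 90.0, 90.0, 90.0),   # invented example cells
     (2, 57.5, 62.3, 69.0, 90.0, 90.0, 90.0)],
)

# Impossible against a single string column; trivial with numeric columns:
hits = con.execute(
    "SELECT dataCollectionId FROM SSXHits WHERE cell_a BETWEEN 75 AND 80"
).fetchall()
print(hits)  # [(1,)]
```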
You should also link `SSXDataCollectionProcessing` to `AutoProcProgram`; this is the generic way to indicate a process launched / ended (gives us timing and success stats). Then from that you can also attach graphs and files using `AutoProcProgramAttachment`.
Update with latest schema version:
Here is how an SSX experiment would fit in this schema:
A `DataCollectionGroup` is a set of `DataCollection`s on the same physical sample (chip, jet...). Every `n` images acquired from the sample (for instance `n = 10000`), we stop the data collection and start a new one.
In the use case of SSX, 1 `DataCollectionGroup` = x `DataCollection` = x `SSXDataCollection` (`SSXDataCollection` is just an extension of `DataCollection` with SSX-specific fields).
This `DataCollectionGroup` is linked to the `BLSample` that was analyzed. This `BLSample` is created and used only for this `DataCollectionGroup`. It is made of:
- A `Crystal`, which can be re-used across many samples when the same crystal source is used in different samples. This crystal is itself composed of:
  - a `Protein`, as described by the `Protein` table.
  - a list of `Component` to specify what other chemicals are in the crystal composition (buffer, ligands...). This list is defined through the `CrystalComposition` linker table, which describes which component is present at what concentration (= `abundance`). This concentration can be expressed in different units (M, %(w/v), %(v/v), mg/ml) for different components, so we are using the `ConcentrationType` lookup table as an enum to specify which unit is used.
- Some other `Component`s that have been added specifically for this `DataCollectionGroup`, defined through the linker table `SampleComposition` on the same principle as `CrystalComposition`.
Let's say we added a new ligand during the data collection to see what happened. Then we need to know when that new ligand was mixed with the sample, so that we can figure out how long it has been in contact with the protein when each image was taken. For that we use the `EventChain`: we can create an `EventChain` named `Sample preparation` and add an `Event` at the time of the mixing with `EventType` = `ReactionTrigger`.
This `EventChain` is also used to describe data acquisition steps with a set of `Xray exposure`, `Xray detection` and `laser excitation` events with a specific period and number of repetitions.
After each `DataCollection` is complete, some processing will run to determine the number of crystal hits, indexed images and statistics on unit cell parameters. This information is saved in `SSXDataCollectionProcessing`, with a link through `AutoProcProgram` and `AutoProcProgramAttachment` to the file containing the unit cell parameters information.
Thanks for the comprehensive description! One thing that is still not entirely clear from your description is whether a single `EventChain` is intended to apply to every image in the data collection (this seems to me to be the sane thing to do) or should every image have its own `EventChain`? If the former, then is `datetime` the appropriate data type for `Event.time`? Perhaps `Event.offset` with data type `float` would be more meaningful?
What does `Event.repetition` represent in this context? How would you e.g. represent a fixed-target dose series of 10 images collected per position on the fixed-target chip? Would this be a single `Xray detection` event with `offset` corresponding to the first image and `repetition = 10`? Or 10 `Xray detection` events, each with their own `offset` corresponding to the start of the nth image?
The way I thought about it is that an `EventChain` represents what happened during the whole data collection.
So in your example it would be more like the second solution: 10 `Xray detection` events, each with their own `offset` = start of the nth image, `repetition` = number of positions in the chip and `period` = time interval between two positions. And then we could add, after all of them, a `Move target` event to represent the position change, with the same `repetition` and `period` values.
As for absolute `time` vs `offset` (relative to `DataCollection.start`?), I'll see with the beamline people which seems to make more sense to them. It might also be clearer to have something like `startOffset/Time` or `firstOffset/Time` instead.
SSX_ispyb.docx provides a list of the fundamental values that we would need to store for SSX experiments. This list can be used as a starting point to build a first SSX prototype in py-ISPyB.