I could not locate prov:ModelRun in the "prov" namespace.
Do you mean something else?
Edit: I found and corrected my typo in the previous post. Thanks @lic10
How about creating a new class gcis:ModelRunOutput as a subclass of prov:Entity and gcis:Dataset, and changing gcis:ModelRun to be a subclass of prov:Activity?
That could work.
Are there any objections to the proposed solution of @xgmachina?
Seeing none, @xgmachina please prepare a pull request.
@xgmachina do you want to prepare this pull request, or should I?
@zednis @justgo129 Should I close this issue, since #130 is done? The changes made are: create a new class gcis:ModelRunOutput as a subclass of both prov:Entity and gcis:Dataset, and change gcis:ModelRun from a subclass of prov:Entity to a subclass of prov:Activity.
I'm all right with it if @rewolfe is.
+1
On Mon, Aug 24, 2015 at 2:43 PM, justgo129 wrote:
I'm all right with it if @rewolfe is.
Robert Wolfe, NASA GSFC @ USGCRP, o: 202-419-3470, m: 301-257-6966
Thanks all. I will close the issue now.
I think model run should not be defined as something other than a type of dataset, since this is how the phrase is used in the modeling community.
@bduggan is there a term in the community for the activity of running a model? As far as I can tell one issue here is that "model run" is used to specify both the activity and the activity output and the context of the usage is required to determine in which way the term is used.
Is this your counter proposal?
- gcis:ModelRun a subclass of gcis:Dataset
- no subclass of prov:Activity specific to the running of models
On Thursday, August 27, Stephan Zednik wrote:
@bduggan is there a term in the community for the activity of running a model?
Good question, "model run generation" or "model run creation" come to mind but I will need to ask around.
Is this your counter proposal?
- gcis:ModelRun a subclass of gcis:Dataset
Yes, though we should check on the attributes...
- no subclass of prov:Activity specific to the running of models
I'm not proposing no subclass, just concerned about using "modelRun" for this purpose.
Brian
OK. I am going to reopen this issue.
@zednis and @bduggan, granted this is one of those terms where many concepts get rolled up and used as convenient, technical shorthand for a complicated process and usage varies greatly across the communities we serve. That said, I tend to think of a model run as one complete execution of some type of model.
Model type classifications vary of course, but, for this initial discussion let's use a simple scheme like analytical, numerical, observational.
Generalizing, each model run takes zero or more model inputs, completes zero or more calculations, and produces zero or more model outputs. If not null, model run inputs can be things like parameters, input datasets, messages (think processing chains) and so on. If not null, model run outputs can be things like parameters, output datasets, messages (processing chains again) and so on.
This is very general. What are your experiences and thoughts?
@aulenbac agreed that this is a case where many terms are rolled up into one and context is required for correct interpretation.
From your definition it sounds like you view model run in a manner very similar to what we had previously agreed on in the ticket, with a model run being an activity.
If the term model run has too much baggage to be happily settled as either the activity or the output, perhaps we avoid using just "model run".
A new proposal:
- gcis:ModelRunOutput a subclass of prov:Entity (does not have to be a dataset)
- gcis:ModelRunExecution a subclass of prov:Activity
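As a rough sketch, the proposal above might look like this in Turtle. Only the class names and superclasses come from the proposal; the gcis: namespace IRI, the labels, and the comments are assumptions for illustration.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
# Assumed namespace IRI for the GCIS ontology.
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .

# Output of a model run; under this proposal it is not required to be a gcis:Dataset.
gcis:ModelRunOutput a owl:Class ;
    rdfs:label "Model Run Output" ;
    rdfs:subClassOf prov:Entity .

# The activity of executing a model, named so that the bare term "model run" is avoided.
gcis:ModelRunExecution a owl:Class ;
    rdfs:label "Model Run Execution" ;   # label text is an assumption
    rdfs:subClassOf prov:Activity .
```

Keeping the two names distinct means neither class has to carry the ambiguous meaning of "model run" on its own.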
Here is a model run in the relational model:
Example:
https://github.com/USGCRP/gcis-sync/blob/master/yaml/model_run/a887f3b4-3d19-44ff-9fa6-b58bbe86dfa5.yaml
Schema:
https://github.com/USGCRP/gcis/blob/master/db/dist/docs/pod/table_model_run.pod
Note that it has a time range which reflects the range of the data, not the time of the activity. Also, note that it may be associated with an activity. It is essentially a dataset.
Here is an activity associated with this model run:
http://data.globalchange.gov/activity/4ef1491f-nca3-cmip3-r201205-process
There are a number of model runs associated with this activity.
Actually, the model runs are inputs to this activity.
Here are four runs (note these are datasets and are called "runs") which are inputs to that activity:
https://esg.llnl.gov:8443/metadata/advancedDatasetSearch.do?d_scenario=sresb1&d_frequency=monthly&d_offset=0&d_model=ncar_pcm1
Here are attributes and metadata about each of the runs:
https://esg.llnl.gov:8443/metadata/showObject.do?id=pcmdi.ipcc4.ncar_pcm1.sresb1.run1.monthly
https://esg.llnl.gov:8443/metadata/showObject.do?id=pcmdi.ipcc4.ncar_pcm1.sresb1.run2.monthly
https://esg.llnl.gov:8443/metadata/showObject.do?id=pcmdi.ipcc4.ncar_pcm1.sresb1.run3.monthly
https://esg.llnl.gov:8443/metadata/showObject.do?id=pcmdi.ipcc4.ncar_pcm1.sresb1.run4.monthly
I think it would be difficult to find data about the start/end times of "model run execution", i.e. when these runs were created.
Brian
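The relationship described in this comment, where the runs are datasets that serve as inputs to an activity, could be sketched in PROV-O roughly as follows. The activity URI is the one linked above; the ex: identifiers for the four runs and the gcis: namespace IRI are hypothetical placeholders, not actual GCIS identifiers.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .   # assumed namespace IRI
@prefix ex:   <http://example.org/run/> .                  # hypothetical identifiers

# The activity linked above, with the four runs as its inputs.
<http://data.globalchange.gov/activity/4ef1491f-nca3-cmip3-r201205-process>
    a prov:Activity ;
    prov:used ex:ncar_pcm1-sresb1-run1 ,
              ex:ncar_pcm1-sresb1-run2 ,
              ex:ncar_pcm1-sresb1-run3 ,
              ex:ncar_pcm1-sresb1-run4 .

# Each run is treated as a dataset, as described in the comment.
ex:ncar_pcm1-sresb1-run1 a gcis:Dataset .
ex:ncar_pcm1-sresb1-run2 a gcis:Dataset .
ex:ncar_pcm1-sresb1-run3 a gcis:Dataset .
ex:ncar_pcm1-sresb1-run4 a gcis:Dataset .
```

Nothing in this sketch records when the runs themselves were produced, which matches the concern that such start and end times would be hard to find.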
@zednis, good points. Avoiding "model run", would it be clearer to say "model inputs" and "model outputs" instead? I'm trying to address the need to produce and use something that is broadly applicable and understandable. We have hydrological models, economic risk models, invasive species models, epidemiological models, economic growth models, land use models, ..., as well.
@bduggan I think we are all in agreement that what the relational model calls a "model run" corresponds to the class gcis:ModelRunOutput in the current (github master) version of the GCIS ontology. Since we have not reached consensus within our own group as to whether "model run" is an activity or an entity, I am starting to think we should avoid using the name as a class in the ontology without additional explicit context.
I think it is fine to keep using it in the relational database in the current manner, but hopefully we will find a representation in the ontology that everyone in our group can be satisfied with.
With that said I am curious as to the group's thoughts on this recent suggestion:
- gcis:ModelRunOutput a subclass of prov:Entity (does not have to be a dataset)
- gcis:ModelRunExecution a subclass of prov:Activity
:+1:
Recent discussions have given me the impression that modeling the activity of a model run is probably not useful at this point. Running a model is a complex and distributed effort, and not a rabbit hole that is worth going down at this point.
I agree. The most important thing to capture is the model run output information, that is, the same information we capture for our other (observational) datasets.
On Mon, Aug 31, 2015 at 12:36 PM, Brian Duggan wrote:
Recent discussions have given me the impression that modeling the activity of a model run is probably not useful at this point. Running a model is a complex and distributed effort, and not a rabbit hole that is worth going down at this point.
OK, are we ok with keeping gcis:ModelRunOutput as the class to represent the model run output, with superclasses gcis:Dataset and prov:Entity?
On Monday, August 31, Stephan Zednik wrote:
OK, are we ok with keeping gcis:ModelRunOutput as the class to represent the model run output, with superclasses gcis:Dataset and prov:Entity?
Sure.
Brian
:+1:
@zednis please feel free to proceed with preparing the pull request.
This is what is currently in the ontology:
gcis:ModelRun a owl:Class ;
    rdfs:label "Model Run" ;
    rdfs:comment "An activity of running a model." ;
    rdfs:subClassOf prov:Activity .

gcis:ModelRunOutput a owl:Class ;
    rdfs:label "Model Run Output" ;
    rdfs:comment "Results generated by running a model." ;
    rdfs:subClassOf prov:Entity, gcis:Dataset .
Should the pull request be to rename gcis:ModelRun to gcis:ModelRunExecution or to simply remove it?
I will leave gcis:ModelRunOutput as it is.
I'll defer to @rewolfe.
I vote that we drop ModelRunExecution. We can still use the more general Activity class if we decide to capture information about the specific instance of a model run execution.
On Thu, Sep 3, 2015 at 10:01 AM, justgo129 wrote:
I'll defer to @rewolfe.
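If a specific model run execution ever needs to be recorded without a dedicated class, the suggestion above to fall back on the general Activity class might look roughly like this. The ex: instance names and the gcis: namespace IRI are hypothetical; prov:generated and prov:wasGeneratedBy are the standard PROV-O properties.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .   # assumed namespace IRI
@prefix ex:   <http://example.org/> .                      # hypothetical instances

# One execution of a model, typed only with the general PROV class.
ex:run-execution-1 a prov:Activity ;
    prov:generated ex:run-output-1 .

# The resulting output, typed with the class kept in the ontology.
ex:run-output-1 a gcis:ModelRunOutput ;
    prov:wasGeneratedBy ex:run-execution-1 .
```

PROV-O defines prov:generated as the inverse of prov:wasGeneratedBy, so stating either direction is enough when a reasoner is in use; both are shown here only for readability.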
OK, I will submit a pull request that drops ModelRunExecution.
Edit: Actually, ModelRunExecution was never added as a class; it was a proposed rename of ModelRun. I think the suggestion to remove the model run activity subclass still holds, so I have prepared #145, which removes the current gcis:ModelRun class.
I believe this ticket is ready to be closed.
Thanks, @zednis. Closed #17 due to merged #145.
There was some question today on the relationship between Model, ModelRun, Dataset, and model output.
From the current GCIS ontology:
No class or property exists with the name "model output."
Right now gcis:ModelRun is a subclass of prov:Entity and the output from a 'model run' activity. We do not have any classes that specifically represent the activity of running the model. We can use an instance of prov:Activity in the linked data to represent the model run process.
If we were to define a class to represent the model run activity it would make sense to make it a subclass of prov:Activity.
The PROV properties prov:generated and prov:wasGeneratedBy would provide the relationships between the model run activity and the output of the model run (currently type gcis:ModelRun).
In today's meeting we discussed modifying the definition of gcis:ModelRun to be a subclass of gcis:Dataset. I think this makes sense and is consistent with the current definition of gcis:Dataset.
We also discussed alternately modifying gcis:ModelRun to be the activity and creating a new class to represent the output of the model run. If we do this I would suggest that new class be a subclass of gcis:Dataset.