OpenEnergyPlatform / ontology

Repository for the Open Energy Ontology (OEO)
Creative Commons Zero v1.0 Universal
106 stars 23 forks source link

The workflow strategy is not complete yet #64

Closed akleinau closed 4 years ago

akleinau commented 4 years ago

As there seem to be different opinions on workflow, especially merges, we need to discuss that in an own issue instead of the now closed pull request #58.

Requirements/demands from the Ontology hackathon meeting:

akleinau commented 4 years ago

@Bachibouzouk I got explicitly asked by @MGlauer and @gnn to let my changes undergo a review process before I merge so I‘m in conflict right now.

@akleinau - No you're not in conflict :) : I am not giving you the order to merge without asking for a review, I am suggesting we do not let a PR open for too long. Therefore suggesting you should soon ask for someone to review your PR so we can merge it and you can keep working on the issue in a subsequent PR :)

My point is that we should keep each PR as small work packages (all the changes you commit should nevertheless not be dependent of changes to come). For example, it seems to me that the commit 693b4504ead2f312cb955e913a83fd5f35c8bcb5 could perfectly be merged into devregardless of the other commits of this PR.

To recap : @MGlauer and @gnn are right that changes need review before merging and one should merge as often as possible (because all developers will modify oeo.omn and it can create merge conflicts). "As often as possible" is to be evaluated by the developer and the exact time span can vary. For small things it can be hours, for larger ones days. If longer than a week, maybe the work can be spitted in smaller independent packages?

Originally posted by @Bachibouzouk in https://github.com/OpenEnergyPlatform/ontology/pull/58#issuecomment-550227742

akleinau commented 4 years ago

I appreciate that your goal to avoid merge conflicts by committing often, but this sounds like a very dangerous strategy, because it would involve merging untested content into dev.

In the ontology world there is no such thing as "local variable" or "encapsulation". Thus, any change you make to an ontology may interact with anything else (via logical inferences). For this reason, if you make significant changes to the structure of an ontology, it is usually not possible to break that down into smaller steps. In my experience you need to finish the task, ensure consistency, and check whether the ontology is able to prove its intended consequences (aka competency questions), and then you make the pull request.

Originally posted by @fabianneuhaus in https://github.com/OpenEnergyPlatform/ontology/pull/58#issuecomment-550307597

Bachibouzouk commented 4 years ago

Very good initiative @akleinau :)

Bachibouzouk commented 4 years ago

Just to make sure my intent is not misinterpreted : I am not giving orders, I am making suggestions so we can find the best workflow to have multiple person contributing to the ontology's repo.

@fabianneuhaus (from your comment within #58 discussion's thread)

I appreciate that your goal to avoid merge conflicts by committing often, but this sounds like a very dangerous strategy, because it would involve merging untested content into dev.

Merging often does not mean merging untested content into dev, merging a branch into dev must be done via a Pull Request and be reviewed (i.e. tested) by someone else (as described in the CONTRIBUTE.md file).

For this reason, if you make significant changes to the structure of an ontology, it is usually not possible to break that down into smaller steps. In my experience you need to finish the task, ensure consistency, and check whether the ontology is able to prove its intended consequences (aka competency questions), and then you make the pull request.

I have no experience with ontology, so I don't know what is the smallest task that can be made on the ontology, this is why I used the "As often as possible" expression. The definition of "possible" must come from someone (like you) who has experience with ontology.

My goal is to ensure that no one else needs to spend another 2 hours solving 54 merge conflicts on the oeo.omn file because the changes were sitting on several feature branches for months (which is also not good for the ontology because there could be many errors arising from the conflict solving). Therefore I am trying to setup a workflow in the CONTRIBUTE.md file. This is a first draft and it would be good if you can help improve it by adding from your ontology experience (typical number of lines changes for a task (order of magnitude, i.e. it is more 100 lines or 1000 lines change for one task), typical number of days for a task (1-2 days, week or months)

One thing is sure : with the current git workflow, it is very likely to get conflicts when more that one people work on the same file. Maybe we could make sure the people do not edit the same lines of the file before starting to work on a task (i.e. a developer could say I will make changes on lines 100-1000, so that another developer could avoid making changes in this region of the file).

fabianneuhaus commented 4 years ago

I understand your frustration having to spend 2 hours of dealing with merge conflicts. We all agree that speedy reviews are desirable and will help to avoid problems. Further, I agree with the goal to break down tasks in the smallest portions possible. My point is just that if you make significant changes to the structure of the ontology, “smallest possible” is still very large. Because you cannot test your changes until you are done applying the changes systematically. And if several people work in parallel while structural changes to the ontology are made, merge conflicts are probably unavoidable.

This problem should occur less frequently as the ontology becomes more mature and stable. Because in the case of established ontologies most changes involve just adding new classes or axioms or maybe some definitions. All of these changes are relatively localised and involve only few axioms. And, thus, the reviews of these changes should be quick and easy.

By the way, in my opinion the testing should be the done by the developer before the Pull Requests.

On 7. Nov 2019, at 12:03, Pierre Francois notifications@github.com wrote:

Just to make sure my intent is not misinterpreted : I am not giving orders, I am making suggestions so we can find the best workflow to have multiple person contributing to the ontology's repo.

@fabianneuhaus https://github.com/fabianneuhaus (from your comment within #58 https://github.com/OpenEnergyPlatform/ontology/pull/58 discussion's thread)

I appreciate that your goal to avoid merge conflicts by committing often, but this sounds like a very dangerous strategy, because it would involve merging untested content into dev.

Merging often does not mean merging untested content into dev, merging a branch into dev must be done via a Pull Request and be reviewed (i.e. tested) by someone else (as described in the CONTRIBUTE.md https://github.com/OpenEnergyPlatform/ontology/blob/dev/CONTRIBUTE.md file).

For this reason, if you make significant changes to the structure of an ontology, it is usually not possible to break that down into smaller steps. In my experience you need to finish the task, ensure consistency, and check whether the ontology is able to prove its intended consequences (aka competency questions), and then you make the pull request.

I have no experience with ontology, so I don't know what is the smallest task that can be made on the ontology, this is why I used the "As often as possible" expression. The definition of "possible" must come from someone (like you) who has experience with ontology.

My goal is to ensure that no one else needs to spend another 2 hours solving 54 merge conflicts on the oeo.omn file because the changes were sitting on several feature branches for months (which is also not good for the ontology because there could be many errors arising from the conflict solving). Therefore I am trying to setup a workflow in the CONTRIBUTE.md https://github.com/OpenEnergyPlatform/ontology/blob/dev/CONTRIBUTE.md file. This is a first draft and it would be good if you can help improve it by adding from your ontology experience (typical number of lines changes for a task (order of magnitude, i.e. it is more 100 lines or 1000 lines change for one task), typical number of days for a task (1-2 days, week or months)

One thing is sure : with the current git workflow it is very likely to get conflicts when more that one people work on the same file. Maybe we could make sure the people do not edit the same lines of the file before starting to work on a task (i.e. a developer could say I will make changes on lines 100-1000, so that another developer could avoid making changes in this region of the file).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/OpenEnergyPlatform/ontology/issues/64?email_source=notifications&email_token=AAWE2W4KYF4KN6GNIOYPU53QSPYXBA5CNFSM4JJXGDCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDMBJXA#issuecomment-551032028, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWE2W6WCPJMVIJCUQR62L3QSPYXBANCNFSM4JJXGDCA.

stap-m commented 4 years ago

Maybe it can be helpful to distinguish workflows between a) including new terms and b) restructuring existing parts.

a) Small steps for PRs can be helpful when including new terms to the ontology. Anyway, I think it is necessary to increase the ontology slowly and sustainably. I make a first suggestion for a work flow:

b) Changing the current structure of the onology might be more difficult and not so easy to break down to small steps, as @fabianneuhaus described. I have no good suggetion for this workflow right now. We are also going to discuss this on next weeks SzenarienDB meeting.

Bachibouzouk commented 4 years ago

@fabianneuhaus - Thank you for your reply. This triggers new questions form my side :

Bachibouzouk commented 4 years ago

Maybe it can be helpful to distinguish workflows between a) including new terms and b) restructuring existing parts.

Thank you for this suggestion @stap-m :) Would you like to include a draft of this in the CONTRIBUTE.md file? Otherwise I can do it and ask you to review it. I would suggest that we include in the PR and issue name a [A] or [B] to indicate if it concerns either a) or b) workflows.

Bachibouzouk commented 4 years ago

I would also suggest that we keep the information about who knows what in an AUTHORS.rst file so that newcomers to the project (like me) get an idea of who's responsible for what.

Ontology experts:

Students currently working on the ontology:

Past contributors to the ontology:

-

fabianneuhaus commented 4 years ago

The number of papers on ontology evaluation probably exceeds the number of ontologies. Here is my own contribution to the pile: https://www.researchgate.net/publication/260834360_Toward_Ontology_Evaluation_across_the_lifecycle

One useful technique is called “Competency Questions”. Papers on this are by Michael Gruninger and Fox. Another involves looking for bad patterns. There used to be tool called Oops! for this purpose, I am not sure whether it is still around.

I don’t think it is possible to estimate duration. In my experience, the time consuming part of ontology development is never doing the changes, it is getting consensus among the people that it is the right thing to do.

On 7. Nov 2019, at 12:51, Pierre Francois notifications@github.com wrote:

@fabianneuhaus https://github.com/fabianneuhaus - Thank you for your reply. This triggers new questions form my side :

Could you provide order of magnitude estimates of the number of days, weeks, month it typically takes to restructure an ontology (I have no idea, so even a vague idea is better than no idea)?

Do you have a procedure to test an ontology after it was restructured ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/OpenEnergyPlatform/ontology/issues/64?email_source=notifications&email_token=AAWE2W24OTXPPOVSPXHMHODQSP6KXA5CNFSM4JJXGDCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDMFDLQ#issuecomment-551047598, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWE2W62XAURIGJLTZQ6HDTQSP6KXANCNFSM4JJXGDCA.

Bachibouzouk commented 4 years ago

Here is my own contribution to the pile

Thank you for the references @fabianneuhaus !

I don’t think it is possible to estimate duration.

Not even a lower limit, i.e. "Based on experience, it usually takes more than X days/weeks/months"?

In my experience, the time consuming part of ontology development is never doing the changes, it is getting consensus among the people that it is the right thing to do.

This is a valuable information for ontology beginners, thank you :)

Bachibouzouk commented 4 years ago

This issue arise from @han-f comment

This is why I am a "fan" to always explain the "why" right where I ask for something. I think this also helps to avoid confusion and saves time (no need to click somewhere else and get familiar with another issue and then click back).

I would suggest that one should at least copy-paste the issue description within the PR description.

Originally posted by @Bachibouzouk in https://github.com/OpenEnergyPlatform/ontology/pull/70#issuecomment-552426191

Bachibouzouk commented 4 years ago
* The goal of this ontology (what it should help with) should be kept in mind by the collaborators (and hence must be clearly described somewhere, like in the README and/or the CONTRIBUTE)

@stap-m, @han-f - could you phrase simply what is the goal?