obophenotype / human-phenotype-ontology

Ontology for the description of human clinical features
http://obophenotype.github.io/human-phenotype-ontology/

PURLs for old versions should be maintained moving forward #2160

Closed Relequestual closed 6 years ago

Relequestual commented 7 years ago

PURLs for old versions of HPO are invalid. This creates a problem if systems want to use PURLs to direct machines to the version of HPO they are using.

I appreciate that there is some effort involved in creating all the releases for previous versions.

Does it sound reasonable to maintain releases moving forward? I don't see anything that's required to achieve this beyond not deleting releases or tags from GitHub. Please correct me if I'm wrong.
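As a rough illustration of what "using a PURL to point at a version" could look like on the client side, here is a minimal sketch. The URL template (`.../obo/hp/releases/<date>/hp.owl`) is an assumption about how dated release PURLs are laid out, not something stated in this thread, and the date is picked only for illustration.

```python
import re
from datetime import date
from typing import Optional

# Assumed layout for dated release PURLs; not confirmed by this thread.
PURL_TEMPLATE = "http://purl.obolibrary.org/obo/hp/releases/{version}/{filename}"
_VERSION_RE = re.compile(r"/obo/hp/releases/(\d{4}-\d{2}-\d{2})/")


def versioned_purl(release: date, filename: str = "hp.owl") -> str:
    """Build the PURL a client would store to pin one specific release."""
    return PURL_TEMPLATE.format(version=release.isoformat(), filename=filename)


def release_of(purl: str) -> Optional[date]:
    """Recover the release date from a versioned PURL, or None if it has none."""
    match = _VERSION_RE.search(purl)
    return date.fromisoformat(match.group(1)) if match else None


# Illustrative date only; it is not claimed to be an actual HPO release.
pinned = versioned_purl(date(2017, 4, 13))
print(pinned)
```

A system that records `pinned` next to its annotations can later tell which release it used, but only for as long as that URL keeps resolving, which is the point of this issue.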

As requested per https://github.com/obophenotype/human-phenotype-ontology/issues/810

drseb commented 7 years ago

You are absolutely right. I will try to find a solution ASAP. Maybe Peter has some resources to do this. Thanks for this ticket.

Relequestual commented 7 years ago

What exactly do you mean by resources? I don't see that achieving this requires doing anything beyond not deleting previous releases from the repo.

cmungall commented 7 years ago

@DoctorBud and @dougli1sqrd can help.

cmungall commented 7 years ago

@Relequestual - not sure what you mean

Relequestual commented 7 years ago

"Maybe Peter has some resources to do this."

What's meant by "resources"? Time? Effort? Servers? I just want to understand the constraints around solving the issue. If I can suggest solutions that make the constraints less of an issue, then I will.

drseb commented 7 years ago

I am back from holiday now.

First, some background: we have switched HPO "release systems" a lot. We started with an SVN system, moved to Hudson, then to Jenkins, and have now moved to GitHub. The fact is that we no longer have all the old releases. Everything we have is in the HPO archive repository.

Resources: getting from the data we have in the HPO archive to a working system that resolves old releases is not totally trivial, IMHO.

First of all: space requirements. GitHub has limits on the disk quota (https://help.github.com/articles/what-is-my-disk-quota/). For this reason, all files in the HPO archive repository have been gzipped (and are still almost 1 GB in size). However, gzipped files would not work well with PURLs, as these must point to non-gzipped files.

One idea was to have a dedicated server. I am totally under-funded: I don't have a server, and I haven't even had money for a new laptop for several years...

Second of all: time requirements. Who will take the time to write, verify, and monitor a converter (HPO archive -> new system)? How will PURLs be resolved? All of this takes time and I currently do not have the resources to do this. So we need a plan.
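To give a feel for the size of the "converter" step, here is a minimal sketch that decompresses one gzipped archive file into a per-release directory that a PURL could then point at. The `releases/<version>/` layout, the function name, and the idea that the caller supplies the version string are all assumptions made for illustration; they are not the HPO-archive conventions, and verification and monitoring are left out entirely.

```python
import gzip
import shutil
from pathlib import Path


def unpack_release(archive_file: Path, releases_root: Path, version: str) -> Path:
    """Decompress one gzipped archive file into releases_root/<version>/.

    Hypothetical layout: the target directory structure is an assumption
    for this sketch, not something defined by the HPO archive.
    """
    target_dir = releases_root / version
    target_dir.mkdir(parents=True, exist_ok=True)
    # "hp.obo.gz" -> "hp.obo"; Path.stem drops only the final ".gz" suffix.
    target = target_dir / archive_file.stem
    with gzip.open(archive_file, "rb") as src, open(target, "wb") as dst:
        # Stream the data so large ontology files are not held in memory.
        shutil.copyfileobj(src, dst)
    return target
```

The decompression itself is short; the open questions raised above (who verifies the output, where it is hosted, and how the PURLs are configured) are where the real effort would go.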

Once we start such a system, we have to maintain it in the future. Nobody is currently funded to do that (or maybe somebody at Peter's institute is), and I have become very cautious about starting such a system.

I would like to have input on this from @pnrobinson @cmungall @mellybelly

pnrobinson commented 7 years ago

Hi Seb,

If you would like to do this, we can put all of this onto the JAX server and give you root rights. I can also fund somebody from CS who would work with both of us to get all of the technical bits working. Let us Skype about this.

-Peter

Peter Robinson

Professor of Computational Biology

The Jackson Laboratory for Genomic Medicine

10 Discovery Drive

Farmington, CT 06032

860.837.2095 t | 860.990.3130 m

peter.robinson@jax.org

www.jax.org

Robinson lab: https://robinsongroup.github.io/

The Jackson Laboratory: Leading the search for tomorrow's cures


Relequestual commented 7 years ago

@drseb Thanks for taking the time to explain. I know it's not always as easy as people perceive it to be, but I'm not the only person who's wondered.

The EBI might already be storing the previous versions (and new versions moving forward) in an accessible fashion. At the very least they keep some information on previous versions to generate history metrics. Their Ontology Lookup Service has had a lot of work done on it over the past year or so.

It looks like discussions to find a solution are underway. However, I'm happy to speak to someone at the EBI and ask a few questions if you'd like (it's a short walk for me).

pnrobinson commented 7 years ago

Hi everybody,

Should we move forward on this? Let's plan on Skyping. Should I ask Eileen to help us with an appointment?

-Peter

nicolevasilevsky commented 7 years ago

Sure!

-- Nicole Vasilevsky, PhD
 Research Assistant Professor, Library, Oregon Health & Science University, vasilevs@ohsu.edu
 503-806-6900
 skype: nicolevasilevsky

pnrobinson commented 6 years ago

@drseb @mellybelly @cmungall can we coordinate on this? I am starting to think that this feature might be something for the precertification project?

cmungall commented 6 years ago

I think the system is currently working well, since @drseb started using the standard obo-github release process.

E.g. both of these resolve as expected
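A claim like this can be checked with a small script. The following is a minimal sketch using only the Python standard library; the function name and the injectable `opener` parameter are choices made for this example, and the URLs to test are left to the caller because the ones referred to above are not preserved here.

```python
import urllib.request


def resolves(url: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> bool:
    """Return True if requesting `url` ends in a 2xx response.

    urlopen follows redirects by default, which is what a PURL relies on.
    `opener` is injectable so the check can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

Running `resolves(...)` over a list of dated release PURLs after each release would be one cheap way to notice if an older version stops resolving.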

A couple of things need to be done

Update editor docs

We still need to update the editor docs to reflect what the release process is.

The current README-editors is very old. I suggest cribbing or directly using README-editors.md from the OSK (replace "foobar" with "hp").

These docs describe the standard release process:

https://github.com/INCATools/ontology-starter-kit/blob/master/template/src/ontology/README-editors.md#release-manager-notes

It's necessary to have this updated so everyone knows how this works.

Pre-2017 PURLs

PURLs from before 2017-03-09 do not resolve, as we only adopted the OSK release method then.

Unless we get funding specifically for this, I suggest this is a 'wontfix'.

Plan for the file getting bigger

The hp.owl release file is 33M; GitHub complains at 50M, and large files grow the overall size of the repo. I don't think there is anything to worry about now, but we should keep an eye on this.
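"Keeping an eye on this" could be as simple as a check in the release script. The sketch below classifies a file size against the 50M warning level mentioned above; the 100M hard limit comes from GitHub's documentation on large files rather than from this thread, and the function names and the choice of MiB as the unit are assumptions.

```python
from pathlib import Path

MIB = 1024 * 1024
WARN_BYTES = 50 * MIB    # level at which GitHub starts to complain
BLOCK_BYTES = 100 * MIB  # level at which GitHub rejects the file (per GitHub docs)


def size_status(size_bytes: int) -> str:
    """Classify a file size as 'ok', 'warning' or 'blocked'."""
    if size_bytes > BLOCK_BYTES:
        return "blocked"
    if size_bytes > WARN_BYTES:
        return "warning"
    return "ok"


def check_release_file(path: Path) -> str:
    """Classify an existing file on disk, e.g. the hp.owl release artifact."""
    return size_status(path.stat().st_size)


# The size quoted above (33M) leaves roughly 17M of headroom before a warning.
print(size_status(33 * MIB))
```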

For comparison, the Mondo OWL file is much bigger (due to the use of axiom annotations), so for it we are using osf.io. I can provide more info, but I think we are good for now.

drseb commented 6 years ago

"It's necessary to have this updated so everyone knows how this works"

Sorry for the dumb question, but why? (I think it is better to have a few key people who are able to create releases; otherwise I am afraid this will cause chaos.)

"PURLs from pre 2017-03-09, do not resolve, as we only adopted the OSK release method then. Unless we get funding specifically for this I suggest this is a 'wont fix'."

This is a problem of me being no git expert, but I saved almost all old releases. I think a clever expert can create those releases retrospectively from https://github.com/Phenomics/HPO-archive. However, I am not sure how important this is.

pnrobinson commented 6 years ago

@drseb -- could you update the README if the above is OK? I am still hoping we can all do a F2F and maybe we can discuss a plan going forward there.

pnrobinson commented 6 years ago

I think this issue is currently resolved as well as possible without specific funding, so I am closing it. Please reopen with specific suggestions if there is a concrete action item.