geneontology / pipeline

Declarative pipeline for the Gene Ontology.
https://build.geneontology.org/job/geneontology/job/pipeline/
BSD 3-Clause "New" or "Revised" License

GitHub push sometimes seems to fail for noctua-models #172

Open kltm opened 4 years ago

kltm commented 4 years ago

While the cached credentials for pushing noctua-models from the machine usually work, we occasionally have issues: see https://github.com/geneontology/pipeline/issues/171.

It would be good to dig in and either find an alternative (token?) or sort out what went on here.

kltm commented 4 years ago

Noting that it may be nearly impossible to track this down, and the ticket may be closed either because it is old or because we have moved on to a better system.

suzialeksander commented 4 years ago

Has this issue (or #171) popped up again? The SGD snapshot Noctua file is missing all annotations after May. I'm looking at noctua_sgd.gpad.gz with header date 2020-06-12.

kltm commented 4 years ago

I noticed this last night while trying to get the new form out and was trying to find this ticket--thank you!

Models should be available again with the success of the next snapshot (https://build.geneontology.org/job/geneontology/job/pipeline/job/snapshot/) run.

I will continue to dig into the underlying cause.

kltm commented 3 years ago

@suzialeksander Ran into this again. It appears to exhaust in six months. Will look into switching over to a different plan.

suzialeksander commented 3 years ago

So this time, all SGD annotations are missing since 22 Jan. Also of note: some annotations made JUST before the maintenance on the evening of 22 Jan (they were made that morning) were missing after the Noctua shutdown that weekend. Those were remade, but those and the annotations made after 22 Jan are what we are missing this time around. Not sure if there's any hidden meaning behind this timing.

kltm commented 3 years ago

Essentially, there are two places for an annotation to go "missing": 1) between the client and the server and 2) between the server and GitHub. This ticket is dealing with number 2, and it isn't actually going "missing" missing; rather, the push action from the server to GitHub gets gummed up. Getting that un-gummed or pushed manually clears it up; we're looking for a way of preventing this consistently in the future. For the maintenance window items that seem to have gone missing, that's a separate line of inquiry. I believe that @vanaukenk is following up on that.
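As a rough illustration of how case 2 could be spotted from the server side, here is a minimal sketch that counts commits sitting in the local checkout that the remote-tracking branch does not have yet. The function names, the repository path, and the `origin/master` ref are assumptions for illustration, not what the pipeline actually runs; it only compares against the last known remote-tracking ref and does not contact GitHub.

```python
import subprocess


def parse_ahead_count(rev_list_output: str) -> int:
    """Parse the output of `git rev-list --count <remote_ref>..HEAD`."""
    text = rev_list_output.strip()
    if not text.isdigit():
        raise ValueError(f"unexpected rev-list output: {rev_list_output!r}")
    return int(text)


def unpushed_commit_count(repo_dir: str, remote_ref: str = "origin/master") -> int:
    """Count local commits that are not reachable from remote_ref.

    repo_dir and remote_ref are hypothetical defaults; adjust them to the
    actual noctua-models checkout and branch.
    """
    result = subprocess.run(
        ["git", "rev-list", "--count", f"{remote_ref}..HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ahead_count(result.stdout)
```

A nonzero count that persists across scheduled runs would suggest the push is stuck, as opposed to the model never having been saved on the server at all.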

dustine32 commented 3 years ago

From https://github.com/geneontology/pipeline/issues/229. Looks like another instance of push-gumming.

SGD model 60418ffa00002093.ttl (titled "Pex35 regulates peroxisome abundance") was created and marked "production" on noctua prod 2021-03-31 but not pushed to noctua-models via automated commit until 2021-04-05: https://github.com/geneontology/noctua-models/commit/50090ad4f17e5fe8b77570344e330a65f9316805. Note there were four other auto-commits between this 4/5 one and 3/31.
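For anyone wanting to quantify this kind of delay, a small sketch of the arithmetic follows. Both function names are hypothetical, and the list of auto-commit dates would have to come from the noctua-models history; none are filled in here.

```python
from datetime import date
from typing import Iterable, List


def push_lag_days(marked_production: date, pushed: date) -> int:
    """Days between a model being marked production and its first commit."""
    return (pushed - marked_production).days


def commits_that_skipped(
    marked_production: date, pushed: date, auto_commit_dates: Iterable[date]
) -> List[date]:
    """Auto-commit dates on or after the production date but before the
    commit that finally carried the model."""
    return [d for d in sorted(auto_commit_dates) if marked_production <= d < pushed]
```

With the dates above, `push_lag_days(date(2021, 3, 31), date(2021, 4, 5))` gives 5.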

Tagging @suzialeksander @kltm

kltm commented 3 years ago

@dustine32 In the other case(s), I think I have a pretty good handle on why it wigs out: basically the token expiring in some new way or similar. In this most recent case, that does not /seem/ to have been the case. In fact, I can find no root cause at all, and the very next run seems to have resolved it without any interaction from us (the run started before we even noticed the previous one had this issue).

balhoff commented 3 years ago

@ukemi says this model has much more recent edits that have not been pushed to GitHub: https://github.com/geneontology/noctua-models/blob/master/models/60418ffa00000375.ttl

(just an additional datapoint from talking to David)

ukemi commented 3 years ago

Also, there are annotations missing. If I look at the web GPAD output of the model, I see this:

MGI MGI:3647519 involved_in GO:0019628 PMID:16462750 ECO:0000314 20210317 MGI has_participant(MGI:MGI:1916142),has_participant(MGI:MGI:98907),starts_with(GO:0004846),ends_with(GO:0051997),has_end_location(GO:0005777) contributor=http://orcid.org/0000-0001-7476-6306|model-state=production|noctua-model-id=gomodel:60418ffa00000375

But I don't see any annotations for MGI:3647519 to GO:0019628 in the snapshot: http://snapshot.geneontology.org/products/annotations/noctua_mgi.gpad.gz

It looks like it is being filtered. If I look at the Noctua_mgi-src gpad, it is there:

MGI MGI:3647519 involved_in GO:0019628 PMID:16462750 ECO:0000314 20210317 MGI has_participant(MGI:MGI:1916142),has_participant(MGI:MGI:98907),starts_with(GO:0004846),ends_with(GO:0051997),has_end_location(GO:0005777) contributor=http://orcid.org/0000-0001-7476-6306|model-state=production|noctua-model-id=gomodel:60418ffa00000375
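A comparison like this could be scripted instead of done by eye. The sketch below assumes both files are gzipped, tab-separated GPAD 1.x with `!` header lines, and keys each annotation on its first six columns (DB, object ID, qualifier, GO ID, reference, evidence code). The function names and the choice of key are assumptions for illustration, not how ontobio does its filtering.

```python
import gzip
from typing import Iterable, List, Set, Tuple


def annotation_keys(lines: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Collect the first six GPAD columns of every annotation line."""
    keys = set()
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 6:
            continue  # not a well-formed annotation row
        keys.add(tuple(cols[:6]))
    return keys


def missing_from_filtered(
    src_lines: Iterable[str], filtered_lines: Iterable[str]
) -> List[Tuple[str, ...]]:
    """Annotations present in the -src GPAD but absent from the filtered one."""
    return sorted(annotation_keys(src_lines) - annotation_keys(filtered_lines))


def read_gpad(path: str) -> List[str]:
    """Read a gzipped GPAD file into a list of lines."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.readlines()
```

Calling `missing_from_filtered(read_gpad(src_path), read_gpad(filtered_path))` on local copies of the two files would list the rows that were dropped somewhere between the source and the snapshot product.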

kltm commented 3 years ago

@ukemi If any annotations exist, it should probably be an ontobio (filtering) issue or a minerva (GPAD production) issue.

@balhoff Unfortunately, it looks like that file got "saved" very soon after you reported that there was another model. @ukemi For gomodel:60418ffa00000375, what interface had you been using with it? It's weird that these end up getting saved automatically somehow right after discovering them. I'm beginning to think this might be an issue with the client code (or minerva) rather than the flush to GitHub itself (in which case we'll want to open a new issue to start putting a picture together). (see https://github.com/geneontology/noctua/issues/715)

kltm commented 3 years ago

Okay, I'm going to pull some of this information over to https://github.com/geneontology/noctua/issues/715, which seems to fit the fact pattern better. This ticket is about GitHub issues; https://github.com/geneontology/noctua/issues/715 is about the still mysterious lack of saving in some cases.