Closed by kltm 2 months ago
@sierra-moxon We can talk about this elsewhere, but there is no need for GORULEs to be part of ontobio; we would just need documentation at https://github.com/geneontology/go-site/tree/master/metadata/rules. Additionally, we could add stanzas to the goa.yaml metadata to indicate where the data is coming from. @pgaudet There can also be more user/external-facing documentation.
For the GO_REF angle, there is also geneontology/go-site#2019
@pgaudet I think everything we need for the software side is now part of the GORULEs framework. Is there anything you'd like to have documented before we close this out?
Hi @kltm Yes, @LiNiMGI and I would like to look at this. Specifically:
Thanks, Pascale
I tried to capture the "preprocess steps" in this diagram. Feel free to use, edit, or throw it away (it did not take me long). Green marks the go preprocess pipeline actions. Diagram to edit here. If this seems useful, we can extend it past the "GAF 2.2 output" in the pink circle to capture what the main GO Pipeline does to transform this file (together with the PAINT and Noctua annotations) into GPAD 2.0 at the end.
Hi Sierra,
This looks great!
A couple of questions:
Thanks, Pascale
Second iteration (many more details) including arrows to indicate how "control" moves between the gopreprocess Makefile and the silver-issue-325-gopreprocess pipeline branch. Also included is the location of the necessary files that are used to generate the MGI GAF now, and the timing in the pipeline where it writes intermediate files back to skyhook.
White = files downloaded
Green = silver-issue-325-gopreprocess pipeline
Pink = Makefile in gopreprocess code repo
Orange = skyhook
Blue = gopreprocess logic
to edit this diagram: https://app.diagrams.net/?mode=google#G1wSPfwmIL6e58LJwbHlFS21COLr2aC1u6#%7B%22pageId%22%3A%22b7a7eaba-c6c5-6fbe-34ae-1d3a4219ac39%22%7D
@pgaudet - I have to go to HGNC in order to use the Alliance orthology file as Alliance uses HGNC ids.
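To make the ID hop concrete, here is a minimal sketch of the lookup chain described above. It assumes the human annotations are keyed by some other identifier (UniProtKB-style IDs are used here purely as an assumption) while the orthology pairs are keyed by HGNC IDs, so the human ID has to be translated to HGNC first. The function names and all IDs are made-up placeholders, not taken from gopreprocess or from the Alliance file.

```python
from collections import defaultdict


def build_ortholog_index(pairs):
    """Group (hgnc_id, mgi_id) orthology pairs by HGNC ID."""
    index = defaultdict(set)
    for hgnc_id, mgi_id in pairs:
        index[hgnc_id].add(mgi_id)
    return index


def mouse_orthologs(human_id, to_hgnc, ortholog_index):
    """Return sorted MGI IDs for a human ID, or [] if either hop is missing."""
    hgnc_id = to_hgnc.get(human_id)
    if hgnc_id is None:
        # No HGNC mapping, so the orthology pairs cannot be consulted.
        return []
    return sorted(ortholog_index.get(hgnc_id, ()))


# Placeholder data (hypothetical IDs, for illustration only).
TO_HGNC = {"UniProtKB:X00001": "HGNC:1", "UniProtKB:X00002": "HGNC:2"}
ORTHOLOG_PAIRS = [("HGNC:1", "MGI:10"), ("HGNC:1", "MGI:11"), ("HGNC:3", "MGI:30")]
INDEX = build_ortholog_index(ORTHOLOG_PAIRS)
```

Calling `mouse_orthologs("UniProtKB:X00001", TO_HGNC, INDEX)` walks both hops; an ID with no HGNC mapping, or an HGNC ID with no orthology pair, yields an empty list rather than an error.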
OK, got it ! thanks :)
@pgaudet If anything more needs to be done here, please re-open.
We want to document the process of taking files from upstream and producing what MGI previously generated. This is currently documented in great detail here: https://drive.google.com/drive/folders/17O5e3gj_fkbSv2vscEYNIzpCNLIq3fG2 (which holds the active notes and technical docs for this project).
However, for the long term, we'd like to encode this information in GORULEs and the like.
An initial step could be documenting this under a new GORULE that is taken care of by the new software (and can be added to the current outputs). This can be made more granular as time goes on.
Tagging @sierra-moxon @ukemi @pgaudet