Open hexylena opened 2 years ago
Hey @hexylena - is anyone currently working on options for this?
We were in contact with someone over the last few years who intended to, but I have not heard from them in a long time, so I think it's safe to say no one is working on it.
In the short term we're implementing the bare minimum for a TRS endpoint, because that achieves useful things for us, but in the future we'd love to actually have workflows with great metadata, uploaded/updated to WFH.
thanks 😄 is there an issue or PR for the endpoint?
I'm trying to automatically parse information from the GTN repository that could be used in WorkflowHub
We have an API if it's helpful, https://training.galaxyproject.org/training-material/api/, but the documentation there hasn't been updated for the new TRS endpoint API I added.
When we've discussed it in the past with the Galaxy IWC / @mvdbeek the discussion trended towards:
But if you have alternative ideas that are easier to implement and don't involve managing some 60+ repositories (even automatically), I think we'd be curious to hear them
also if you need any new endpoints / datasets exposed to make it easier just give me a shout.
Hey @hexylena and also @stain 😄
Would it make sense to do the following? I might be missing something obvious.
Apologies if this approach has already been discussed and ruled out.
That sounds potentially fine to me?
@mvdbeek, any opinions on this, since you're heavily involved in WFs as well?
https://github.com/galaxyproject/training-material/pull/3895 will make the API a bit nicer to work with, and add enforced linting for metadata on workflows going forward:
$ curl --silent http://localhost:4002/training-material/api/topics/metagenomics/tutorials/mothur-miseq-sop/tutorial.json | jq .workflows
[
  {
    "workflow": "mothur-miseq-sop.ga",
    "tests": false,
    "url": "http://localhost:4002/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/workflows/mothur-miseq-sop.ga"
  }
]
i.e. you can get this directly from the tutorial page rather than having to look at the topic page (though the addition of the direct URL to the workflow will extend to both).
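For anyone consuming this from a script, here is a minimal sketch of reading the `workflows` list out of a tutorial's JSON. It assumes the payload keeps the shape shown in the curl output above (`workflow`, `tests`, `url`); the helper names `list_workflows` and `untested_workflows` are hypothetical, and the sample is embedded as a string so the sketch runs without a server. A real client would fetch `tutorial.json` over HTTP instead.

```python
import json

# Sample payload copied from the curl output above, wrapped in the
# "workflows" key that `jq .workflows` selected.
SAMPLE_TUTORIAL_JSON = """
{
  "workflows": [
    {
      "workflow": "mothur-miseq-sop.ga",
      "tests": false,
      "url": "http://localhost:4002/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/workflows/mothur-miseq-sop.ga"
    }
  ]
}
"""


def list_workflows(tutorial):
    """Return (file name, url, has_tests) tuples for one parsed tutorial.json.

    Hypothetical helper: assumes each entry has "workflow" and "url" keys,
    and treats a missing "tests" key as "no tests".
    """
    return [
        (wf["workflow"], wf["url"], bool(wf.get("tests", False)))
        for wf in tutorial.get("workflows") or []
    ]


def untested_workflows(tutorial):
    """Return the file names of workflows that report no tests."""
    return [name for name, _url, has_tests in list_workflows(tutorial) if not has_tests]


if __name__ == "__main__":
    tutorial = json.loads(SAMPLE_TUTORIAL_JSON)
    for name, url, has_tests in list_workflows(tutorial):
        print(name, url, "tests" if has_tests else "no tests")
```

Filtering on `tests` is just one possible use; it would let a consumer skip workflows that have no tests yet.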
And if there are more APIs needed that would make things more convenient just let us know.
I think whatever you decide is fine. I'd probably try to steer towards a model where you can version your workflows and upload them only when tests pass. That is what https://github.com/galaxyproject/iwc/blob/main/.github/workflows/workflow_test.yml does, and it should largely be reusable as is.
Fair enough @ tests, we've been pushing for those from our side as well but no one ever wants to write them for some reason. Maybe it's time to make them mandatory. (Edit: wf tests now mandatory. https://github.com/galaxyproject/training-material/pull/3895)
It should be fairly easy with https://github.com/galaxyproject/iwc/blob/main/workflows/README.md#generate-test-from-a-workflow-invocation, and since you need small test data for the tutorial anyway it shouldn't be more work to generate this.
I know! We have documentation on it in multiple places and everything, but still seems like a high bar for folks unfortunately. Not sure why. Maybe we just don't nag enough.
Looking at the workflow_test.yml, it's great to have that as a reference. I fear/suspect we'll end up writing our own, as we'd like to test against EU rather than a one-off server, both to avoid some of the time costs of testing workflows and to potentially benefit from having a "previously run public workflow" that we can attach as a resource to a training material.
(no need to install from git, probably better if you don't)
Ah indeed, I think that was written before a new release was cut at some point, so it's very outdated. Thanks! I'll get that corrected.
If planemo was written in typescript we could just generate all that in the UI 😆. Or maybe we could set up a celery task that re-uses the dependency mechanism to install planemo ...
Generate the test? Ah it'd be so cool.
I keep having the exact same thought about ptdk / training_init, we could replace this entire thing with a few calls to the API and generate a markdown file from that, rather than requiring server side processes. (Of course we'd need some CORS exceptions, but, it'd be worth it.)
Just following up with this again, were you still planning to work on this @supernord? Is there any support you need from our end? (it's getting mentioned in a presentation as "work in progress" so figured I'd check in)
Hey @hexylena - thanks for following up 😄 I'm trying to finish code to collect the required metadata from the GTN API as a first step, before then trying to create an RO-Crate. Maybe I could get your thoughts on the approach, and whether it will work, when I push the code to GitHub?
I would like to also discuss this with the WorkflowHub club team next week
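To make the second step a bit more concrete, here is a rough sketch of what turning one workflow entry from the GTN API into RO-Crate metadata could look like. This is an illustration only, not the approach actually taken: the function name `build_workflow_crate` and its parameters are hypothetical, the crate is built by hand as a plain dictionary following the general RO-Crate 1.1 layout, and it leaves out fields a real Workflow RO-Crate would need (for example `description`, `datePublished`, `license` and `programmingLanguage`).

```python
import json


def build_workflow_crate(workflow_file, name, source_url):
    """Build a minimal RO-Crate 1.1 style metadata document as a dict.

    Hypothetical helper for illustration; the result is not a complete or
    validated crate.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: lists the workflow file and marks it as the main entity.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "hasPart": [{"@id": workflow_file}],
                "mainEntity": {"@id": workflow_file},
            },
            {
                # The workflow file itself, with the URL reported by the API.
                "@id": workflow_file,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
                "url": source_url,
            },
        ],
    }


if __name__ == "__main__":
    crate = build_workflow_crate(
        "mothur-miseq-sop.ga",
        "mothur-miseq-sop",  # assumed display name, taken from the file name
        "https://training.galaxyproject.org/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/workflows/mothur-miseq-sop.ga",
    )
    print(json.dumps(crate, indent=2))
```

The printed JSON is what would be saved as `ro-crate-metadata.json` next to the `.ga` file. A dedicated RO-Crate library would probably be a better fit than hand-built dictionaries for anything beyond a quick experiment.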
Yes, absolutely, feel free to open a PR somewhere/tag me somewhere and I'll be happy to look at it!
After reading the RO-Crate training materials they added to the GTN (https://training.galaxyproject.org/training-material/topics/fair/), I have to say I feel a lot more hopeful for this!
Hey @hexylena & @mvdbeek - this is where I've added the code I have so far for converting GTN metadata into RO-Crates: https://github.com/AustralianBioCommons/create-gtn-rocrates
Hopefully this is useful 😄
Just an issue to track planning / progress