Open korikuzma opened 1 year ago
@reece @andreasprlic Would you be able to provide any additional information on this?
The seqrepo update procedure is closely tied to the UTA update procedure. Can we merge this issue with #6?
@andreasprlic That’s fine with me. I know these are two issues that are highly wanted. @reece are you ok with this?
@andreasprlic in what sense do you see these as "tied"?
They are pretty different in terms of mechanism, data sources, complexity, and reliability. With the exception of the tools that we might choose to automate the process, I don't think that lessons from one will inform the other.
So, I'd prefer to keep them separate. It's easier to compose from pieces than to disentangle a monolith.
Currently we have a manual update procedure that includes steps for both UTA and seqrepo. When I saw this ticket to "automate" future updates, I was under the impression the plan might be to wrap the steps of the manual procedure as a "workflow". As such, an update of both UTA and seqrepo would be done together at the same time.
Perhaps, as we build a workflow for updating the content, we could design this so each of the steps for UTA and seqrepo could get run independently and as a separate (parallel?) process. Perhaps that would make it meet what you are expecting?
Yes. Similar tooling for parallel workflows would be grand. Also, UTA really only depends on SeqRepo because UTA needed to realign to get CIGAR strings way back before NCBI GFFs existed. With the GFFs, we don't need to realign.
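For illustration only, the "independent but parallel" layout discussed above could be sketched roughly as follows. Everything here is an assumption: the step functions (`fetch_sequences`, `load_sequences`, `fetch_gff`, `load_transcripts`) are hypothetical placeholders, not real seqrepo or UTA commands, and `run_workflow` / `run_all` are made-up helper names. The point is only that each workflow is an ordered list of steps that can be run on its own or alongside the other.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Step = Callable[[], str]


def run_workflow(name: str, steps: List[Step]) -> List[str]:
    """Run one workflow's steps in order and return a simple log."""
    return [f"{name}: {step()}" for step in steps]


# Hypothetical placeholder steps; real steps would wrap the manual procedure.
def fetch_sequences() -> str:
    return "fetched sequences"


def load_sequences() -> str:
    return "loaded sequences"


def fetch_gff() -> str:
    return "fetched GFF"


def load_transcripts() -> str:
    return "loaded transcripts"


# Each workflow is defined separately, so either one can be run alone.
WORKFLOWS: Dict[str, List[Step]] = {
    "seqrepo": [fetch_sequences, load_sequences],
    "uta": [fetch_gff, load_transcripts],
}


def run_all(workflows: Dict[str, List[Step]]) -> Dict[str, List[str]]:
    """Run every workflow in its own thread; steps within one stay ordered."""
    with ThreadPoolExecutor(max_workers=max(1, len(workflows))) as pool:
        futures = {
            name: pool.submit(run_workflow, name, steps)
            for name, steps in workflows.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


if __name__ == "__main__":
    for log in run_all(WORKFLOWS).values():
        print("\n".join(log))
```

Running a single workflow is then just `run_workflow("seqrepo", WORKFLOWS["seqrepo"])`, which keeps the two update procedures composable rather than bundled into one monolith.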
@reece @larrybabb and I could co-lead if you are leading a separate project
Submitter Name
@reece
Submitter Affiliation
MyOme
Requested By
Everyone using SeqRepo
Lead(s)
@reece
biocommons Repo
seqrepo
Project Details
Hackathon Project Slide
SeqRepo data has not been updated since Jan 29, 2021. Instructions for updating SeqRepo are here.
The goals for this project are: