legumeinfo / microservices

A collection of microservices developed and maintained by the Legume Information System
https://legumeinfo.org/
Apache License 2.0

Add macro-synteny-paf microservice #624

Open svengato opened 2 months ago

svengato commented 2 months ago

This will be similar to macro-synteny-blocks, but return a PAF file (format defined here) to applications such as JBrowse.
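For orientation, here is a minimal sketch of how one synteny block might be written as a PAF record. The 12 mandatory tab-separated PAF columns are standard, but the `SyntenyBlock` container and the `block_to_paf_line` helper are hypothetical names, not part of the microservices code. A synteny block has no base-level alignment, so the "matches" and "alignment length" columns are filled with the query span purely as an assumption.

```python
from dataclasses import dataclass


@dataclass
class SyntenyBlock:
    """Hypothetical container for one query/target synteny block."""
    query_name: str
    query_length: int
    query_start: int    # 0-based, inclusive
    query_end: int      # 0-based, exclusive
    strand: str         # "+" or "-"
    target_name: str
    target_length: int
    target_start: int   # 0-based, inclusive
    target_end: int     # 0-based, exclusive


def block_to_paf_line(block: SyntenyBlock, mapq: int = 255) -> str:
    """Format a block as the 12 mandatory tab-separated PAF columns."""
    # Assumption: with no base-level alignment available, the query span
    # stands in for both "residue matches" and "alignment block length".
    span = block.query_end - block.query_start
    fields = [
        block.query_name,
        block.query_length,
        block.query_start,
        block.query_end,
        block.strand,
        block.target_name,
        block.target_length,
        block.target_start,
        block.target_end,
        span,   # number of residue matches (placeholder)
        span,   # alignment block length (placeholder)
        mapq,   # mapping quality; 255 means "missing"
    ]
    return "\t".join(str(field) for field in fields)
```

A complete PAF file would then be one such line per block, joined with newlines.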

svengato commented 2 months ago

Current experimental branches (destined for oblivion):

macro-synteny-paf branch: My first cut at a macro-synteny-paf microservice. It calls other microservices (genes, chromosome) instead of the Redis database. It also does not look up chromosome names from the genome prefix; instead, we have to specify the chromosome name format and the number of chromosomes in the URL. https://github.com/legumeinfo/microservices/tree/macro-synteny-paf

pairwise-paf branch: This version uses macro-synteny-blocks/paf?genome1=[...]&genome2=[...] instead of a separate microservice. It is very slow for some reason. https://github.com/legumeinfo/microservices/tree/pairwise-paf

svengato commented 2 months ago

Since then we have discussed how to do it properly. My understanding of how this works:

macro-synteny-paf takes two genome prefixes (for the query and target), and looks up their chromosome names.

For each query chromosome, it calls a version (to be developed) of macro-synteny-blocks that takes the chromosome name instead of the chromosome's gene families and looks up the ordered list of gene families itself. That service passes the gene families and the target chromosome names to pairwise-macro-synteny-blocks, which computes the target blocks and passes them back through macro-synteny-blocks to macro-synteny-paf.

Finally, it assembles the PAF file, either by looking up the block positions itself or by reading them from the target blocks, if we extend those to include positions (also to be developed).
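The three steps above could be sketched roughly as follows. This is only an outline of the control flow under my current understanding: `assemble_paf` and the three callables passed to it are hypothetical stand-ins for the chromosome lookup, the (to be developed) macro-synteny-blocks call, and the position lookup, not existing functions in the repository.

```python
from typing import Callable, Dict, List

# Hypothetical block representation: a plain dict as returned by the
# (assumed) block-computing service.
Block = Dict[str, object]


def assemble_paf(
    query_prefix: str,
    target_prefix: str,
    lookup_chromosomes: Callable[[str], List[str]],
    compute_blocks: Callable[[str, List[str]], List[Block]],
    format_block: Callable[[str, Block], str],
) -> str:
    """Build PAF text for every query chromosome against all targets."""
    # Step 1: resolve both genome prefixes to chromosome names.
    query_chromosomes = lookup_chromosomes(query_prefix)
    target_chromosomes = lookup_chromosomes(target_prefix)

    lines: List[str] = []
    for query_chromosome in query_chromosomes:
        # Step 2: ask the blocks service for target blocks, passing the
        # chromosome name rather than its gene families.
        blocks = compute_blocks(query_chromosome, target_chromosomes)
        # Step 3: turn each block into one PAF line. Whether positions
        # are looked up here or already carried by the block is hidden
        # inside format_block.
        for block in blocks:
            lines.append(format_block(query_chromosome, block))
    return "\n".join(lines)
```

Passing the three steps in as callables keeps the sketch independent of how each lookup is actually done (another microservice, or the Redis database), which is the part still under discussion.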

svengato commented 2 months ago

Current version 17222f1 of macro-synteny-paf calls the chromosome microservice to look up the gene families, and passes these to macro-synteny-blocks. It then uses the genes microservice to look up the returned block positions. (In theory, we could move both of these functionalities into macro-synteny-blocks.)