DEIB-GECO / GMQL

GMQL - GenoMetric Query Language
http://www.bioinformatics.deib.polimi.it/geco/
Apache License 2.0

Command to create synthetic annotation regions #54

Open marcomass opened 7 years ago

marcomass commented 7 years ago

Add a command to automatically create “annotation regions”, i.e. regions of a given length that start from a given location (which may be the left end of the leftmost region in a sample/dataset) and/or end at a given location (which may be the right end of the rightmost region in a sample/dataset), with a given spacing in between, on all or only some chromosomes.

akaitoua commented 7 years ago

@marcomass, would you please give me a more detailed example to clarify this issue?

marcomass commented 7 years ago

Suppose you want to use some synthetic reference regions. Currently, you have to create them in a new sample file using an external program, then upload the file as a new dataset in the repository, and finally use the new dataset as a reference dataset, e.g. in a map operation. Instead, it would be much easier and more self-contained to have a command within the GMQL language to generate and use such a synthetic region dataset. For instance, something like: Synt_data = GENERATE(length: 100; spaced: 1000; on: 1, 2); to generate reference regions of length 100, spaced 1000 bp apart [on chromosomes 1 and 2 (optional)]. Some competitor languages, such as START, which I pointed out to you some time ago, have this built-in capability.
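
To make the request concrete, here is a minimal Python sketch of what such a command might compute. It is not GMQL code and not an existing GMQL API: the names `Region`, `bounds_from_sample`, and `generate_regions` are hypothetical, coordinates are assumed to be 0-based half-open, and "spaced" is assumed to mean the gap between the end of one region and the start of the next (it could equally be read as start-to-start distance).

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A genomic region; 0-based, half-open coordinates (an assumption)."""
    chrom: str
    start: int
    stop: int


def bounds_from_sample(regions: Iterable[Region]) -> Dict[str, Tuple[int, int]]:
    """Per chromosome, return (left end of leftmost region, right end of rightmost region)."""
    bounds: Dict[str, Tuple[int, int]] = {}
    for r in regions:
        if r.chrom in bounds:
            left, right = bounds[r.chrom]
            bounds[r.chrom] = (min(left, r.start), max(right, r.stop))
        else:
            bounds[r.chrom] = (r.start, r.stop)
    return bounds


def generate_regions(
    bounds: Dict[str, Tuple[int, int]],
    length: int,
    spacing: int,
    chroms: Optional[Iterable[str]] = None,
) -> Iterator[Region]:
    """Yield regions of `length` bp separated by gaps of `spacing` bp.

    `bounds` maps each chromosome to the (start, end) interval to fill.
    `chroms` optionally restricts generation to some chromosomes.
    """
    if length <= 0 or spacing < 0:
        raise ValueError("length must be positive and spacing non-negative")
    wanted = None if chroms is None else set(chroms)
    for chrom, (left, right) in bounds.items():
        if wanted is not None and chrom not in wanted:
            continue
        start = left
        # Only emit regions that fit entirely inside the interval.
        while start + length <= right:
            yield Region(chrom, start, start + length)
            start += length + spacing


# Rough analogue of GENERATE(length: 100; spaced: 1000; on: 1, 2),
# with made-up chromosome intervals for illustration.
example_bounds = {"chr1": (0, 2500), "chr2": (100, 1300), "chr3": (0, 5000)}
synt_data = list(generate_regions(example_bounds, 100, 1000, chroms=["chr1", "chr2"]))
```

When the start and end locations should come from an existing sample rather than fixed coordinates, the output of `bounds_from_sample` can be passed as `bounds`.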