cerebis / meta-sweeper

Parametric sweep of simulated microbial communities and metagenomic sequencing.
GNU General Public License v3.0

Possible extension of profiles to include more definition. #73

Open cerebis opened 7 years ago

cerebis commented 7 years ago

There is increasing evidence that finer-grained control over optional simulation parameters is necessary to closely approximate real data. This could be supported by adapting the profile table to include per-genome simulation options and parameter values.

  1. We have examples where genomes show systematic variation in non-specific cleavage.
  2. The inter-arm anti-diagonal is not apparent in some genomes.
  3. CID intensity varies in degree (including none).

The profile is currently a flat table, which in database terms is not fully normalized: the cell value is repeated once for each of its chromosomes. Since cells contain chromosomes, there is a natural hierarchy. Whether defining the profile another way is worthwhile is debatable in the simple case we are using now.
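To make the hierarchy concrete, here is a minimal Python sketch that groups flat rows by cell. The column layout `(cell, chromosome, abundance, copy_number)` and the function name `to_hierarchy` are assumptions made for illustration; they are not taken from the actual profile format.

```python
# Hypothetical flat profile rows: (cell, chromosome, abundance, copy_number).
# The cell name and its abundance repeat on every chromosome row.
flat_profile = [
    ("cell1", "chr1", 0.5, 1),
    ("cell1", "chr2", 0.5, 1),
    ("cell2", "chrA", 0.5, 1),
]


def to_hierarchy(rows):
    """Group flat rows into {cell: {"abundance": ..., "chromosomes": {...}}}."""
    cells = {}
    for cell, chrom, abundance, copy_number in rows:
        entry = cells.setdefault(cell, {"abundance": abundance, "chromosomes": {}})
        # A flat table allows contradictory per-cell values; catch them here.
        if entry["abundance"] != abundance:
            raise ValueError(f"inconsistent abundance for {cell}")
        entry["chromosomes"][chrom] = {"copy_number": copy_number}
    return cells


profile = to_hierarchy(flat_profile)
```

In the nested form each cell-level value is stored once, so the inconsistency check above would no longer be needed.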

However, if we were to begin including other runtime options, making the profile much more of a simulation definition, then it would make sense to abandon the flat table. In that case, a mark-up such as YAML, with its use of dictionaries, might be a better fit.

e.g. Below, control parameters that do not appear take their default values, and in the simplest case a chromosome would not require anything beyond its name. I am still unsure whether users will appreciate this.

cell1:
    abundance: 0.5
    chromosomes:   # hypothetical key: YAML cannot mix list items with mapping keys
        - chr1: { copy_number: 1, cid: true, anti-rate: 0.1 }
        - chr2: { cid: false }
cell2:
    abundance: 0.5
    chromosomes:
        - chrA
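
As a rough sketch of how omitted parameters could fall back to defaults, the Python below resolves each chromosome entry against a defaults table. Several things here are assumptions for illustration only: the default values in `DEFAULTS`, the `chromosomes` key that groups the list, the helper name `resolve_chromosome`, and the `profile` dict, which stands in for what a YAML loader might return.

```python
# Assumed defaults: the option names mirror the example above, but the
# values are illustrative only and not taken from the simulator.
DEFAULTS = {"copy_number": 1, "cid": True, "anti-rate": 0.0}


def resolve_chromosome(item):
    """Accept 'chrA' or {'chr1': {...}} and return (name, full options)."""
    if isinstance(item, str):
        # Simplest case: a bare name, so every option takes its default.
        name, overrides = item, {}
    elif isinstance(item, dict) and len(item) == 1:
        (name, overrides), = item.items()
        overrides = overrides or {}
    else:
        raise ValueError(f"unrecognised chromosome entry: {item!r}")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options for {name}: {sorted(unknown)}")
    # Explicit values win over defaults.
    return name, {**DEFAULTS, **overrides}


# Stand-in for the parsed YAML (hypothetical 'chromosomes' key).
profile = {
    "cell1": {"abundance": 0.5, "chromosomes": [
        {"chr1": {"copy_number": 1, "cid": True, "anti-rate": 0.1}},
        {"chr2": {"cid": False}},
    ]},
    "cell2": {"abundance": 0.5, "chromosomes": ["chrA"]},
}

resolved = {
    cell: dict(resolve_chromosome(c) for c in spec["chromosomes"])
    for cell, spec in profile.items()
}
```

Rejecting unknown option names is a design choice in this sketch: it means a misspelled option is reported as an error and is not silently replaced by a default.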