Open cerebis opened 5 months ago
A TOML-based profile definition has now been implemented. Rather than defining many simulation parameters globally, the format lets each one be set at the level at which it takes effect.
Level | Parameters |
---|---|
Community | spurious_rate |
Cell | abundance, trans_rate (intermolecular rate) |
Replicon | copy_number, linear, anti_rate |
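
As a rough illustration of the table above, the level-to-parameter mapping can be written as a small lookup and used to flag keys placed at the wrong level. This is only a sketch: `LEVEL_PARAMS`, `STRUCTURAL` and `misplaced_keys` are hypothetical names, not part of the simulator, and the structural keys (`name`, `cells`, `replicons`, `segments`) are assumed from the examples in this thread.

```python
# Hypothetical lookup: parameters allowed at each level, copied from the table.
LEVEL_PARAMS = {
    "community": {"spurious_rate"},
    "cell": {"abundance", "trans_rate"},
    "replicon": {"copy_number", "linear", "anti_rate"},
}

# Assumed structural keys (naming and nesting), taken from the examples.
STRUCTURAL = {"name", "cells", "replicons", "segments"}


def misplaced_keys(level, table):
    """Return the keys of `table` that are not valid at `level`."""
    allowed = LEVEL_PARAMS[level] | STRUCTURAL
    return sorted(k for k in table if k not in allowed)


# copy_number is a replicon-level parameter, so it is flagged on a cell.
cell = {"name": "ecoli", "abundance": 0.6, "copy_number": 1}
print(misplaced_keys("cell", cell))  # ['copy_number']
```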
In experimenting with this format, I have found that the initial definition is best created programmatically and then serialized to TOML. Afterwards, modifying the resulting file by hand is much easier.
In TOML, "community" is a table, "cells" and "replicons" are arrays of tables, and "segments" is a simple array of strings.
Example
In Python, TOML tables are deserialised as dictionaries, while arrays become lists. A user can therefore go in reverse: build the structure in Python and serialise it to TOML. The following is a simple community composed of 2 cells and 3 sequences.
community = {
    'spurious_rate': 0.01,
    'cells': [
        # First cell in community -- a single segment
        {'name': 'ecoli',
         'abundance': 0.6,
         'trans_rate': 0.1,
         'replicons': [
             {'name': 'chromosome',
              'copy_number': 1,
              'linear': False,
              'anti_rate': 0,
              'segments': ['contig_1']}
         ]
        },
        # Second cell in community -- in two pieces
        {'name': 'saur',
         'abundance': 0.4,
         'trans_rate': 0.2,
         'replicons': [
             {'name': 'chromosome',
              'copy_number': 1,
              'linear': True,
              'anti_rate': 0.3,
              'segments': ['contig_2', 'contig_3']}
         ]
        },
    ]
}
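
To show how a dictionary of this shape maps onto TOML, here is a minimal hand-written serialiser. It is a sketch under stated assumptions, not the simulator's own code: `toml_value`, `dump_table` and `community_to_toml` are hypothetical helpers, they handle only the three-level layout used in this thread, and strings are not escaped. In practice a TOML library would normally do this job (for example `toml.dumps` from the third-party `toml` package).

```python
def toml_value(v):
    """Format a scalar or a list of scalars as a TOML value."""
    if isinstance(v, bool):  # test bool before numbers: bool is an int subclass
        return "true" if v else "false"
    if isinstance(v, str):
        return '"' + v + '"'  # assumes plain names that need no escaping
    if isinstance(v, list):
        return "[" + ", ".join(toml_value(x) for x in v) + "]"
    return repr(v)  # ints and floats


def dump_table(header, table, child_key=None):
    """Emit a header line followed by the table's own key/value pairs."""
    lines = [header]
    lines += [f"{k} = {toml_value(v)}" for k, v in table.items() if k != child_key]
    return lines


def community_to_toml(community):
    """Serialise the community -> cells -> replicons hierarchy."""
    lines = dump_table("[community]", community, "cells")
    for cell in community["cells"]:
        lines += dump_table("[[community.cells]]", cell, "replicons")
        for rep in cell["replicons"]:
            lines += dump_table("[[community.cells.replicons]]", rep)
    return "\n".join(lines) + "\n"


# A cut-down community with a single cell, for illustration only.
example = {
    "spurious_rate": 0.01,
    "cells": [
        {"name": "saur", "abundance": 0.4, "trans_rate": 0.2,
         "replicons": [
             {"name": "chromosome", "copy_number": 1, "linear": True,
              "anti_rate": 0.3, "segments": ["contig_2", "contig_3"]},
         ]},
    ],
}
print(community_to_toml(example))
```

Each table's own keys are written before its nested arrays of tables, which keeps them attached to the right header.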
A larger example of the TOML profile definition
The following also involves two cells, but each cell comprises two replicons, and each replicon is made up of one or more sequence fragments (segments).
[community]
spurious_rate = 0.01
[[community.cells]]
name = "ecoli"
abundance = 1
trans_rate = 0.1
[[community.cells.replicons]]
name = "chromosome"
copy_number = 1
linear = true
anti_rate = 0
segments = [ "contig_1", "contig_2",]
[[community.cells.replicons]]
name = "plasmid"
copy_number = 4
linear = false
anti_rate = 0
segments = [ "contig_3",]
[[community.cells]]
name = "bsubt"
abundance = 0.5
[[community.cells.replicons]]
name = "chromosome"
copy_number = 1
linear = true
anti_rate = 0
segments = [ "contig_4", "contig_5", "contig_6",]
[[community.cells.replicons]]
name = "plasmid"
copy_number = 1
linear = false
anti_rate = 0
segments = [ "contig_7",]
The format used to define a community is presently a simple flat table. This approach incurs a great deal of duplicated information, and a cleaner approach would be to use JSON or TOML to define a simple object hierarchy.
The fundamental component is just the one-to-many relationship:
Additional details would become parameters at the relevant object level.
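
One way to picture such an object hierarchy is as nested dataclasses, each holding a list of its children plus the parameters that belong to its level. This is a sketch only: the nesting (community, cell, replicon, segment) and the field names are assumed from the TOML examples elsewhere in this thread, and the class names are hypothetical.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Replicon:
    name: str
    copy_number: int
    linear: bool
    anti_rate: float
    segments: List[str] = field(default_factory=list)  # one replicon, many segments


@dataclass
class Cell:
    name: str
    abundance: float
    trans_rate: float
    replicons: List[Replicon] = field(default_factory=list)  # one cell, many replicons


@dataclass
class Community:
    spurious_rate: float
    cells: List[Cell] = field(default_factory=list)  # one community, many cells


plasmid = Replicon("plasmid", copy_number=4, linear=False, anti_rate=0.0,
                   segments=["contig_3"])
ecoli = Cell("ecoli", abundance=1.0, trans_rate=0.1, replicons=[plasmid])
community = Community(spurious_rate=0.01, cells=[ecoli])

print(community.cells[0].replicons[0].segments)  # ['contig_3']
# asdict() converts the objects to nested dicts and lists.
print(asdict(community)["cells"][0]["name"])  # ecoli
```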
An example prototype definition using TOML