popsim-consortium / demes-python

Tools for describing and manipulating demographic models.
https://popsim-consortium.github.io/demes-docs/
ISC License

Demes for other programming languages #85

Closed jeromekelleher closed 3 years ago

jeromekelleher commented 3 years ago

We'd like the demes YAML to be the standard interchange format for demographic models. We currently have the demes Python module, which will allow any Python-based simulator/inference method to easily use this format. However, it's hard to make the argument that this can be a universal interchange format without at least considering how this would work for simulators/inference methods that do not have Python frontends. Writing out a demes YAML file is easy, since it's just plain text. Parsing it is another story though.

Some questions:

  1. Do we imagine having (say) a C++ parser for demes at some point in the future? This would fulfil the vast majority of uses, I would imagine.
  2. Should we split the repo into a specification and a Python reference implementation or keep everything in the one repo? I.e., should we follow something like zarr, which has the zarr-specs repo and also the zarr-python repo?

If we do imagine ultimately having, say, Python, C++, R and Java parsers for demes, then there's definitely an argument for keeping them in separate repos.
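As an illustration of why writing is the easy half, here is a minimal sketch that emits a demes-style YAML document using nothing but string formatting. The helper name `write_demes_yaml` is hypothetical, and the field names (`time_units`, `demes`, `name`, `epochs`, `start_size`) are assumed from the demes format rather than taken from this thread. Names that need YAML quoting are not handled.

```python
def write_demes_yaml(demes, time_units="generations"):
    """Render a minimal demes-style YAML document as plain text.

    `demes` is a list of (name, start_size) pairs; each deme gets a
    single epoch. Hypothetical helper, not part of demes-python.
    """
    lines = [f"time_units: {time_units}", "demes:"]
    for name, start_size in demes:
        lines.append(f"  - name: {name}")
        lines.append("    epochs:")
        lines.append(f"      - start_size: {start_size}")
    return "\n".join(lines) + "\n"


text = write_demes_yaml([("ancestral", 1000), ("derived", 500)])
print(text)
```

Any language with string formatting can do the same, which is why the open question in this issue is about parsing rather than writing.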

jeromekelleher commented 3 years ago

@molpopgen - in principle, how hard would it be to write a C++ demes parser that would build an object model like we have in Python? I'd imagine there must be some reasonably good C++ yaml parsers out there?

grahamgower commented 3 years ago

If you're happy with C rather than C++, then I might be able to help. https://github.com/tlsa/libcyaml seems to support a schema, a bit like strictyaml.

Certainly, it makes sense to split into multiple repositories. But at least for now, we don't even have a spec, so maybe we can come back to that in a few months' time?

molpopgen commented 3 years ago

> @molpopgen - in principle, how hard would it be to write a C++ demes parser that would build an object model like we have in Python? I'd imagine there must be some reasonably good C++ yaml parsers out there?

The short answer is that I have no idea. I'm sure there are parsers out there, modulo one's license preferences.

Once parsed, you still have to deal with the underlying object structure, etc. Not hard, but not a road I ever see myself taking.
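A rough sketch of that post-parsing step, assuming a YAML parser has already produced a plain dictionary. The classes `Epoch`, `Deme` and `Graph` and the function `build_graph` are hypothetical simplifications for illustration, not the demes-python object model, and the single resolution rule shown (`end_size` defaulting to `start_size`) is only an example of the kind of work involved.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Epoch:
    start_size: float
    end_size: float  # filled in from start_size when the input omits it
    end_time: float = 0.0


@dataclass
class Deme:
    name: str
    epochs: List[Epoch] = field(default_factory=list)


@dataclass
class Graph:
    time_units: str
    demes: List[Deme] = field(default_factory=list)


def build_graph(data: Dict[str, Any]) -> Graph:
    """Turn the plain dict a YAML parser returns into typed objects."""
    if "time_units" not in data:
        raise ValueError("missing required field: time_units")
    graph = Graph(time_units=data["time_units"])
    seen = set()
    for d in data.get("demes", []):
        name = d["name"]
        if name in seen:
            raise ValueError(f"duplicate deme name: {name}")
        seen.add(name)
        deme = Deme(name=name)
        for e in d.get("epochs", []):
            start = float(e["start_size"])
            if start <= 0:
                raise ValueError(f"deme {name}: start_size must be positive")
            end = float(e.get("end_size", start))
            deme.epochs.append(Epoch(start, end, float(e.get("end_time", 0))))
        graph.demes.append(deme)
    return graph


# Stand-in for the output of a YAML parser.
parsed = {
    "time_units": "generations",
    "demes": [{"name": "A", "epochs": [{"start_size": 1000}]}],
}
graph = build_graph(parsed)
print(graph)
```

The parser library only gets you as far as `parsed`; the validation and defaulting in `build_graph` would have to be rewritten in each target language.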

jeromekelleher commented 3 years ago

> If you're happy with C rather than C++, then I might be able to help. https://github.com/tlsa/libcyaml seems to support a schema, a bit like strictyaml.

Sure, C is even better (for broad compatibility!), and that library looks good at first glance with a liberal license.

> Certainly, it makes sense to split into multiple repositories. But at least for now, we don't even have a spec, so maybe we can come back to that in a few months' time?

I agree, but I think we should make the decision on whether we want to have multiple repos before we move the repo to popsim. I.e., this repo becomes popsim-consortium/demes-python so we don't have to go through the hassle of renaming it later.

molpopgen commented 3 years ago

This seems reasonable: https://github.com/jbeder/yaml-cpp

The latest release is C++11 and drops Boost as a build dependency.

MIT license. Also seems to be an Ubuntu package, although maybe not the newest version.

molpopgen commented 3 years ago

> > If you're happy with C rather than C++, then I might be able to help. https://github.com/tlsa/libcyaml seems to support a schema, a bit like strictyaml.

> Sure, C is even better (for broad compatibility!), and that library looks good at first glance with a liberal license.

> > Certainly, it makes sense to split into multiple repositories. But at least for now, we don't even have a spec, so maybe we can come back to that in a few months' time?

> I agree, but I think we should make the decision on whether we want to have multiple repos before we move the repo to popsim. I.e., this repo becomes popsim-consortium/demes-python so we don't have to go through the hassle of renaming it later.

Multiple repos makes sense to me. You can imagine a complex situation evolving: someone writes a C spec and provides their own Python and/or R bridges with new objects for each language. That's probably beyond the scope of a minimalist Python package like we're going for.

grahamgower commented 3 years ago

It sounds like we should split into two repositories when we move to popsim-consortium. The docs are fairly well separated right now, so this hopefully won't be too painful.

I think we've got a lot of useful software covered by having a Python library. Can anyone think of specific non-Python programs/libraries (actively maintained) that might want to consume our YAML files? I guess there are R libraries? I'm not particularly inclined to work in R myself, but it would be helpful to have specific tools in mind.

jeromekelleher commented 3 years ago

I'm mainly thinking about simulators like SLiM and Nemo. If there was a C/C++ parser for demes, implementing support for them would be fairly straightforward (one would hope). Although in SLiM's case, I expect things would be complicated a lot by how this would interact with Eidos, and how Eidos programs would interact with the demes model.

grahamgower commented 3 years ago

https://github.com/grahamgower/demes-c

It's fairly complete, in that valid input results in the same fully-resolved graph as demes-python. None of the error conditions have been tested.

jeromekelleher commented 3 years ago

Whoa! :star_struck:

molpopgen commented 3 years ago

A Rust version is on my to-do list...

molpopgen commented 3 years ago

This looks great @grahamgower !

grahamgower commented 3 years ago

Closing this, because it's no longer relevant to the demes-python repository.