nextstrain / ncov

Nextstrain build for novel coronavirus SARS-CoV-2
https://nextstrain.org/ncov
MIT License

Build name wildcard does not support numbers #482

Closed huddlej closed 2 years ago

huddlej commented 4 years ago

Current behavior

@rneher points out that build names cannot currently contain numbers; when they do, Snakemake produces an error that is difficult to interpret. These errors are caused by overly strict wildcard constraints on the build name.

Expected behavior

We should support numbers in build names, especially when we want to put dates in the build name (e.g., for historical builds of specific time periods).

Possible solution

Changing the current constraints regex to build_name = r'(?:[_a-zA-Z0-9-](?!(tip-frequencies|gisaid|zh)))+' should fix the problem, although I have not tested this yet to confirm.
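For anyone who wants to check the proposed pattern outside of Snakemake, here is a minimal sketch using Python's re module. The proposed regex is copied verbatim from above. The letters-only pattern is an assumption that stands in for the stricter constraint (it is just the proposed pattern with 0-9 removed, not necessarily what the workflow uses today), and is_valid_build_name is a hypothetical helper. Using re.fullmatch only approximates how Snakemake applies wildcard constraints inside a path pattern.

```python
import re

# Proposed constraint, copied verbatim from this issue.
PROPOSED = r'(?:[_a-zA-Z0-9-](?!(tip-frequencies|gisaid|zh)))+'

# ASSUMPTION: a stand-in for the stricter constraint (the proposed pattern
# without digits). The actual current regex is not quoted in this issue.
LETTERS_ONLY = r'(?:[_a-zA-Z-](?!(tip-frequencies|gisaid|zh)))+'


def is_valid_build_name(name, pattern=PROPOSED):
    """Hypothetical helper: True if the whole name satisfies the constraint."""
    return re.fullmatch(pattern, name) is not None


if __name__ == "__main__":
    for name in ["global", "swiss_2020-12-20", "global_tip-frequencies"]:
        print(
            name,
            "letters-only:", is_valid_build_name(name, LETTERS_ONLY),
            "proposed:", is_valid_build_name(name, PROPOSED),
        )
```

The negative lookahead runs after every character, so a name is rejected as soon as any character is immediately followed by tip-frequencies, gisaid, or zh. This presumably keeps suffixed file names such as global_tip-frequencies from being read as build names.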

emmahodcroft commented 4 years ago

@huddlej I have tried this on our ncov-swiss build on a local branch, and it does indeed seem to solve the issue and allow build names that contain numbers!

huddlej commented 3 years ago

PR 522 fixed this issue for the general case of most individual builds, but it also broke the Nextstrain builds, which need dates in their path names but not in their build names (e.g., ncov_global-2020-12-20.json).
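One way to see how allowing digits could break dated paths is the sketch below. The two path templates are assumptions made up for illustration, not the workflow's real rule outputs, and candidate_parses is a hypothetical helper. Once digits are allowed in the build name, the example file name fits both an undated and a dated template, with different build names in each reading.

```python
import re

# Constraint proposed earlier in this issue, copied verbatim.
BUILD_NAME = r'(?:[_a-zA-Z0-9-](?!(tip-frequencies|gisaid|zh)))+'
DATE = r'\d{4}-\d{2}-\d{2}'

# ASSUMPTION: hypothetical path templates, written as regexes. The real
# Snakemake rules in the workflow may be structured differently.
UNDATED = re.compile(rf'ncov_(?P<build_name>{BUILD_NAME})\.json')
DATED = re.compile(rf'ncov_(?P<build_name>{BUILD_NAME})-(?P<date>{DATE})\.json')


def candidate_parses(filename):
    """Return every (template label, named groups) pair that matches filename."""
    parses = []
    for label, pattern in (("undated", UNDATED), ("dated", DATED)):
        match = pattern.fullmatch(filename)
        if match:
            parses.append((label, match.groupdict()))
    return parses


if __name__ == "__main__":
    for label, groups in candidate_parses("ncov_global-2020-12-20.json"):
        print(label, groups)
```

Under these assumptions, the undated template reads the build name as global-2020-12-20, while the dated template reads it as global with date 2020-12-20. With a letters-only build name, only the dated reading would match.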