veg / hivclustering

Infer molecular transmission networks from pairwise distance files (part of HIV-TRACE)

Import Error running `hivnetworkcsv` #40

Open FynnFreyer opened 1 year ago

FynnFreyer commented 1 year ago

I'm not sure whether this is an issue with hivtrace generating an illegal call, but running `hivnetworkcsv -i tn93output.csv -t .015 -f plain -J -q` fails on my toy data example.

Expected behaviour

Running `hivnetworkcsv -i tn93output.csv -t .015 -f plain -J -q` on a CSV file (generated by hivtrace, using tn93) works without error.

Content of `tn93output.csv`:

```csv
ID1,ID2,Distance
55-00003,55-00003,0
55-00015,55-00015,0
55-00004,55-00004,0
55-00014,55-00014,0
55-00031,55-00031,0
55-00007,55-00007,0
55-00012,55-00012,0
55-00008,55-00008,0
55-00016,55-00016,0
55-00013,55-00013,0
55-00018,55-00018,0
55-00010,55-00010,0
55-00001,55-00001,0
55-00020,55-00020,0
55-00023,55-00023,0
55-00001,55-00002,0.0148724
55-00024,55-00024,0
55-00025,55-00025,0
55-00026,55-00026,0
55-00027,55-00027,0
55-00005,55-00005,0
55-00028,55-00028,0
55-00029,55-00029,0
55-00009,55-00009,0
55-00030,55-00030,0
55-00006,55-00006,0
55-00002,55-00002,0
55-00017,55-00017,0
55-00021,55-00021,0
55-00019,55-00019,0
55-00011,55-00011,0
55-00022,55-00022,0
```

Actual behaviour

Running the command produces the following error:

Fitting the degree distribution to various densities
Traceback (most recent call last):
  File "/home/fynn/RKI/Sandbox/venv/bin/hivnetworkcsv", line 622, in <module>
    make_hiv_network()
  File "/home/fynn/RKI/Sandbox/venv/bin/hivnetworkcsv", line 55, in make_hiv_network
    network_info = describe_network(network, True, settings().singletons)
  File "/home/fynn/RKI/Sandbox/venv/lib/python3.8/site-packages/hivclustering/networkbuild.py", line 253, in describe_network
    distro_fit = network.fit_degree_distribution()
  File "/home/fynn/RKI/Sandbox/venv/lib/python3.8/site-packages/hivclustering/mtnetwork.py", line 2499, in fit_degree_distribution
    hy_instance = hy.HyphyInterface()
NameError: name 'hy' is not defined

Steps to reproduce

Reason

The `fit_degree_distribution` method of the `transmission_network` class runs without HyPhy having been imported first, so the name `hy` is undefined when it calls `hy.HyphyInterface()`. A quick fix would be to try `import hppy as hy` there, as `_test_edge_support` does. But maybe it should actually call `_test_edge_support` here, since that does more than just import HyPhy; I don't know.
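To illustrate the kind of guard I mean, here is a minimal sketch and not the actual hivclustering code. The helper `load_optional` and its `purpose` argument are hypothetical names I made up; the only thing taken from the traceback is that the method needs a module bound to `hy` before it calls `hy.HyphyInterface()`.

```python
import importlib


def load_optional(module_name, purpose):
    """Import an optional dependency at call time.

    Hypothetical helper: returns the imported module, or raises an
    ImportError that names the missing module and what needed it,
    instead of a later NameError on an unbound name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for {purpose} but could not be imported"
        ) from exc


# Assumed usage inside a method such as fit_degree_distribution:
#     hy = load_optional("hppy", "fitting the degree distribution")
#     hy_instance = hy.HyphyInterface()
```

Binding `hy` inside the method, rather than relying on another method having imported it earlier, means the failure mode is an explicit import error regardless of call order.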

FynnFreyer commented 1 year ago

This does not seem to be an issue with the development branch.

Running the same command there produces the following output on stdout:

{"Network Summary":{"Edges":1,"Nodes":2,"Sequences used to make links":2,"Clusters":2,"Singletons":29},"Multiple sequences":{"Subjects with":0,"Followup, days":null},"Cluster sizes":[2],"HIV Stages":{"Unknown":2},"Directed Edges":{"Count":0,"Reasons for unresolved directions":{"Missing dates":1}},"Degrees":{"Distribution":[2],"Model":"Pareto","rho":53.66790117478374,"rho CI":[1.0907568780466308,10000.0],"BIC":0.6931471805599453,"fitted":null},"Settings":{"threshold":0.015,"edge-filtering":null,"contaminants":null,"singletons":true,"compact_json":true,"created":"2023-08-22T17:19:58.872191+00:00"},"Nodes":{"cluster":[1,1],"id":["55-00001","55-00002"]},"Edges":{"sequences":[["55-00001","55-00002"]],"directed":[false],"support":[0.0],"length":[0.0148724],"removed":[false],"attributes":[["BULK"]],"target":[1],"source":[0]}}

Since develop is a couple of commits behind main, I'd guess this was introduced in commit c899f91f5. The import check was probably just not added in the other place as well.

I'll send a PR.