veg / hivclustering

Infer molecular transmission networks from pairwise distance files (part of HIV-TRACE)

subcluster #36

Open lwxing opened 3 years ago

lwxing commented 3 years ago

I have read subclusters.md, but I still ran into some issues when running the command-line examples.

issue1:

root@iZm5e768xkpnkqcbiedc13Z:~/hivclustering# hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp \
    -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" --before 20090101 \
    -O tests/subclusters/pirc-2009.json
Fitting the degree distribution to various densities
Traceback (most recent call last):
  File "/usr/local/bin/hivnetworkcsv", line 620, in <module>
    make_hiv_network()
  File "/usr/local/bin/hivnetworkcsv", line 517, in make_hiv_network
    if prior_network_subclusters:
UnboundLocalError: local variable 'prior_network_subclusters' referenced before assignment

issue2:

root@iZm5e768xkpnkqcbiedc13Z:~/hivclustering# hivnetworkcsv -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 \
    -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" \
    -P tests/subclusters/pirc-2009.json -j -O tests/subclusters/pirc.json
Fitting the degree distribution to various densities
Error with prior network processing: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/usr/local/bin/hivnetworkcsv", line 620, in <module>
    make_hiv_network()
  File "/usr/local/bin/hivnetworkcsv", line 108, in make_hiv_network
    prior_network = ht_process_network_json(json.load (settings().prior))
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

issue3:

root@iZm5e768xkpnkqcbiedc13Z:~/hivclustering# hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" --before 20090101 -P tests/prior_networks/prior.json -O tests/subclusters/pirc-2009.json
Fitting the degree distribution to various densities
[WARNING] Removed 1698 nodes from the previous network,......
......
Added 475 edges compared to the prior network
Traceback (most recent call last):
  File "/usr/local/bin/hivnetworkcsv", line 620, in <module>
    make_hiv_network()
  File "/usr/local/bin/hivnetworkcsv", line 568, in make_hiv_network
    combined_id = "%d.%d" % (cluster_id, subcluster_id)
TypeError: %d format: a number is required, not NoneType
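
The "Expecting value: line 1 column 1 (char 0)" message in issue2 is what Python's json module reports when the text it is given is empty. One possible explanation, not confirmed in this thread, is that the run in issue1 crashed before writing anything to tests/subclusters/pirc-2009.json, so the file passed to -P had no content. The sketch below only demonstrates the json behaviour in isolation; describe_json_text is a hypothetical helper and is not part of hivclustering.

```python
import json


def describe_json_text(text):
    """Return 'ok' if text parses as JSON, otherwise the decoder's message.

    Hypothetical helper for illustration; not part of hivclustering.
    """
    try:
        json.loads(text)
        return "ok"
    except json.JSONDecodeError as err:
        return str(err)


# An empty string stands in for an empty prior-network file.
empty_result = describe_json_text("")
# A minimal valid JSON document for comparison.
valid_result = describe_json_text("{}")

print(empty_result)
print(valid_result)
```

Checking that the prior network file is non-empty before passing it with -P would distinguish this case from a file that contains malformed JSON.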

spond commented 3 years ago

Dear @lwxing,

Thanks for reporting this issue. There was a bug in the code that caused your issue 1, and the others followed from it. I just committed a fix. You should now see:

hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp  -p 0 "^([^|]+)\|([0-9]+)" -p 0 "^([^|]+)\|([0-9]+)" --before 20090101 -O tests/subclusters/pirc-2009.json                    
Fitting the degree distribution to various densities
At threshold 0.01 there were 77 subclusters
At threshold 0.005 there were 54 subclusters
hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.01 -f regexp  -p 0 "^([^|]+)\|([0-9]+)" -p 0 "^([^|]+)\|([0-9]+)" --before 20090101 -O tests/subclusters/pirc.json -P tests/subclusters/pirc-2009.json     
Fitting the degree distribution to various densities
Cluster 1 [19 nodes] matches previous cluster 391 [19 nodes]
Cluster 2 [2 nodes] matches previous cluster 459 [2 nodes]
Cluster 3 [8 nodes] matches previous cluster 392 [8 nodes]
Cluster 4 [2 nodes] matches previous cluster 450 [2 nodes]
Cluster 5 [3 nodes] matches previous cluster 404 [3 nodes]
Cluster 6 [3 nodes] matches previous cluster 402 [3 nodes]
Cluster 7 [2 nodes] matches previous cluster 451 [2 nodes]
Cluster 8 [3 nodes] matches previous cluster 405 [3 nodes]
Cluster 9 [4 nodes] matches previous cluster 400 [4 nodes]
Cluster 10 [2 nodes] matches previous cluster 447 [2 nodes]
Cluster 11 [6 nodes] matches previous cluster 397 [6 nodes]
Cluster 12 [2 nodes] matches previous cluster 461 [2 nodes]
Cluster 13 [7 nodes] matches previous cluster 394 [7 nodes]
Cluster 14 [2 nodes] matches previous cluster 416 [2 nodes]
Cluster 15 [4 nodes] matches previous cluster 401 [4 nodes]
....
hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.01 -f regexp  -p 0 "^([^|]+)\|([0-9]+)" -p 0 "^([^|]+)\|([0-9]+)" --before 20090101 -P tests/prior_networks/prior.json -O tests/subclusters/pirc-2009.json 
Fitting the degree distribution to various densities
....
Cluster 58 is a new cluster
Cluster 59 is a new cluster
Cluster 60 is a new cluster
Cluster 61 is a new cluster
Cluster 62 is a new cluster
Cluster 63 is a new cluster
Cluster 64 is a new cluster
Cluster 65 is a new cluster
Cluster 66 is a new cluster
Cluster 67 is a new cluster
Cluster 68 is a new cluster
Cluster 69 is a new cluster
Cluster 70 is a new cluster
Cluster 71 is a new cluster
Cluster 72 is a new cluster
Cluster 73 is a new cluster
Cluster 74 is a new cluster
Cluster 75 is a new cluster
Cluster 76 is a new cluster
Cluster 77 is a new cluster
Added 273 edges compared to the prior network

Best, Sergei

lwxing commented 3 years ago

Dear @spond, thank you for your reply. I have updated hivnetworkcsv and hivnetworkannotate to match your fix. Issue 1 is solved, but issue 3 still occurs.

root@iZm5e768xkpnkqcbiedc13Z:~/hivclustering# hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" --before 20090101 -O tests/subclusters/pirc-2009.json
Fitting the degree distribution to various densities
Traceback (most recent call last):
  File "/usr/local/bin/hivnetworkcsv", line 619, in <module>
    make_hiv_network()
  File "/usr/local/bin/hivnetworkcsv", line 567, in make_hiv_network
    combined_id = "%d.%d" % (cluster_id, subcluster_id)
TypeError: %d format: a number is required, not NoneType
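
The TypeError in this traceback comes from the %d conversion receiving None in place of an integer. The sketch below reproduces that exception in isolation, using the same format expression as the traceback line. It does not show which of the two values is None inside hivnetworkcsv, or why; format_combined_id is a hypothetical stand-in for that line.

```python
def format_combined_id(cluster_id, subcluster_id):
    """Format two integer IDs the same way as the line in the traceback.

    Hypothetical stand-in for illustration; not the hivclustering source.
    """
    return "%d.%d" % (cluster_id, subcluster_id)


# Two integers format without error.
print(format_combined_id(3, 1))

# A missing (None) ID triggers the TypeError seen in the traceback.
failure = None
try:
    format_combined_id(3, None)
except TypeError as err:
    failure = type(err).__name__
    print("TypeError:", err)
```

The exact wording of the message differs between Python versions, but the exception type is the same.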

spond commented 3 years ago

Dear @lwxing,

Try modifying your regular expressions to explicitly escape the | character so instead of

"^([^|]+)|([0-9]+)"

use

"^([^|]+)\|([0-9]+)"

Best, Sergei
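
The difference between the two patterns above can be checked directly with Python's re module. An unescaped | is the alternation operator, so the first pattern means "either a leading run of non-pipe characters, or a run of digits", and only one capture group is filled per match. The sample ID below is hypothetical and is not taken from pirc.csv. Whether hivnetworkcsv applies the pattern with search or match is not stated in this thread; for this input both behave the same.

```python
import re

# Hypothetical sequence ID of the form <name>|<date>; not taken from pirc.csv.
sample_id = "SAMPLE01|20081231"

# Unescaped "|": alternation between two independent sub-patterns.
unescaped = re.compile(r"^([^|]+)|([0-9]+)")
# Escaped "\|": a literal pipe character separating the two groups.
escaped = re.compile(r"^([^|]+)\|([0-9]+)")

unescaped_groups = unescaped.search(sample_id).groups()
escaped_groups = escaped.search(sample_id).groups()

print(unescaped_groups)  # ('SAMPLE01', None)
print(escaped_groups)    # ('SAMPLE01', '20081231')
```

With the unescaped pattern the second group is None, so any field read from it would be missing. Inside double quotes in a POSIX shell, \| is passed through to the program unchanged.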

lwxing commented 3 years ago

[image attachment, not preserved]

spond commented 3 years ago

Dear @lwxing,

Hmm, this is curious.

% hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)\|([0-9]+)" -p 0 "^([^|]+)\|([0-9]+)" --before 20090101 -O tests/subclusters/pirc-2009.json
Fitting the degree distribution to various densities
At threshold 0.01 there were 77 subclusters
At threshold 0.005 there were 54 subclusters

What's your Python version?

Best, Sergei

lwxing commented 3 years ago

% hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" --before 20090101 -O tests/subclusters/pirc-2009.json

Python 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0] on linux

spond commented 3 years ago

Dear @lwxing,

I am not sure why the issue occurs, since I currently can't reproduce it (I have Python 3.9). I'll keep the issue open.

Best, Sergei

liamxg commented 10 months ago

@spond please help me out:

(base) simon@192 hivclustering % hivnetworkcsv -j -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" --before 20090101 -O tests/subclusters/pirc-2009.json
Fitting the degree distribution to various densities
Traceback (most recent call last):
  File "/Users/simon/opt/miniconda3/bin/hivnetworkcsv", line 4, in <module>
    __import__('pkg_resources').run_script('hivclustering==1.6.3', 'hivnetworkcsv')
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 720, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1559, in run_script
    exec(code, namespace, namespace)
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/hivclustering-1.6.3-py3.9.egg/EGG-INFO/scripts/hivnetworkcsv", line 622, in <module>
    make_hiv_network()
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/hivclustering-1.6.3-py3.9.egg/EGG-INFO/scripts/hivnetworkcsv", line 569, in make_hiv_network
    combined_id = "%d.%d" % (cluster_id, subcluster_id)
TypeError: %d format: a number is required, not NoneType

liamxg commented 10 months ago

@spond I also use python 3.9.

(base) simon@192 hivclustering % hivnetworkcsv -i tests/subclusters/pirc.csv -t 0.02,0.01,0.005 -f regexp -p 0 "^([^|]+)|([0-9]+)" -p 0 "^([^|]+)|([0-9]+)" -P tests/subclusters/pirc-2009.json -j -O tests/subclusters/pirc.json
Fitting the degree distribution to various densities
Error with prior network processing: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/Users/simon/opt/miniconda3/bin/hivnetworkcsv", line 4, in <module>
    __import__('pkg_resources').run_script('hivclustering==1.6.3', 'hivnetworkcsv')
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 720, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1559, in run_script
    exec(code, namespace, namespace)
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/hivclustering-1.6.3-py3.9.egg/EGG-INFO/scripts/hivnetworkcsv", line 622, in <module>
    make_hiv_network()
  File "/Users/simon/opt/miniconda3/lib/python3.9/site-packages/hivclustering-1.6.3-py3.9.egg/EGG-INFO/scripts/hivnetworkcsv", line 109, in make_hiv_network
    prior_network = ht_process_network_json(json.load (settings().prior))
  File "/Users/simon/opt/miniconda3/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/Users/simon/opt/miniconda3/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Users/simon/opt/miniconda3/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/simon/opt/miniconda3/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)