Closed alpole23 closed 10 months ago
Hello!
BiG-SCAPE 1.x assigns each family the index of the cluster that affinity propagation chooses as that family's centroid. This is why the family numbers may start at 0, and why there are large jumps in the numbering (a family of 7 clusters will cause a jump of anywhere from 1 to 7).
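To make the effect concrete, here is a minimal toy sketch. It does not use BiG-SCAPE's actual code; the list `exemplar_of` is made-up data standing in for the output of affinity propagation, where position i holds the index of the cluster chosen as the centroid for cluster i.

```python
# Hypothetical toy data, not BiG-SCAPE output: entry i is the index of the
# cluster picked as the centroid (exemplar) for cluster i.
exemplar_of = [0, 1, 1, 1, 1, 1, 1, 7]


def family_numbers(exemplars):
    """Return the distinct family numbers when a family is named after
    the index of its centroid."""
    return sorted(set(exemplars))


families = family_numbers(exemplar_of)
print(families)  # [0, 1, 7]
```

Cluster 0 is its own centroid, so a family 0 exists, and the six-member family centered on cluster 1 uses up indices 1 through 6, so the next family number is 7.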
We are working on version 2.0, which assigns family numbers consecutively, starting at 1. Unfortunately, it is difficult to adapt version 1.x to do the same.
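In the meantime, the numbers can be remapped after the run. The following is only a sketch of such a post-processing step, not part of BiG-SCAPE. It assumes the clustering file is tab-separated with the BGC name in the first column and the family number in the second, and that header lines start with `#`; the function name `renumber_families` and the sample rows are made up for illustration.

```python
import csv
import io


def renumber_families(tsv_text):
    """Remap family numbers to 1, 2, 3, ... in order of the original numbers.

    Assumes two tab-separated columns (BGC name, family number) and
    header lines beginning with '#'.
    """
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    header = [r for r in rows if r[0].startswith("#")]
    data = [r for r in rows if not r[0].startswith("#")]

    # Sort the distinct original numbers and assign consecutive new ones.
    old_numbers = sorted({int(r[1]) for r in data})
    mapping = {old: new for new, old in enumerate(old_numbers, start=1)}

    return header + [[r[0], str(mapping[int(r[1])])] for r in data]


# Hypothetical sample input, not real BiG-SCAPE output.
sample = "#BGC Name\tFamily Number\nbgc_a\t0\nbgc_b\t1\nbgc_c\t7\nbgc_d\t7\n"
renumbered = renumber_families(sample)
for row in renumbered:
    print("\t".join(row))
```

The mapping preserves the original ordering of the families, so family 0 becomes 1, family 1 becomes 2, and family 7 becomes 3. Any other output file that refers to the original family numbers would need the same mapping applied.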
Hi adraismawur,
I understand. Thanks for the explanation!
I am running the latest bigscape conda package (v1.1.6).
Here is my command line:
python /home/a-m/alexp2/.conda/envs/bigscape_update/lib/python3.7/site-packages/bigscape/bigscape.py --mix --no_classify --include_singletons --clans-off --cutoffs 0.5 --inputdir /home/a-m/alexp2/antismash_results/antismash7/test_directory/ --outputdir /home/a-m/alexp2/bigscape_results/test_directory/ --pfam_dir /home/a-m/alexp2/multismash/pfam
My question is about how bigscape defines the family numbers. In the 'mix_clustering_c0.50.tsv' file in the network_files folder, one of the family numbers is zero. The family numbers also jump from 0 -> 1 -> 7, where I would expect consecutive increments like 1 -> 2 -> 3. The current family numbering scheme does not seem intuitive.
- Is there a way to have bigscape start the numbering from one?
- Is there a particular reason for the large family number jumps? Or is it possible to get bigscape to assign them consecutively?
mix_clustering_c0.50.tsv example file contents: