Closed by sylviesworld 3 years ago
mostShared.sh returns only blobs that are in 20 or more projects; in this case the shared blob is in only one project.
You can modify the following line in mostShared.sh (though the script will then be much slower):
join -v1 $i.fb $i.badfb | ~/lookup/getValues b2ManyP | ~/lookup/lsort 10G -t\; -k2 -n | head | sort -t\; -k1 > $i.fb2n
and change it to
join -v1 $i.fb $i.badfb | ~/lookup/getValues b2P | awk -F\; '{print $1";"(NF-1)}' | ~/lookup/lsort 10G -t\; -k2 -n | sort -t\; -k1 > $i.fb2n
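To illustrate what the added awk step does, here is a minimal sketch. It assumes that getValues b2P emits one line per blob in the form blob;project;project;..., which is not confirmed in this thread; the blob ids and project names in the sample are made up.

```shell
# Hypothetical sample of b2P-style output: blob id followed by the projects containing it.
sample='aaa;projA;projB;projC
bbb;projD'

# Same awk step as in the modified line: print the blob id and the
# number of remaining fields (NF-1), i.e. the number of projects.
counts=$(printf '%s\n' "$sample" | awk -F\; '{print $1";"(NF-1)}')
printf '%s\n' "$counts"
```

With this count in the second field, blobs that appear in only one other project are kept instead of being filtered out by the 20-project threshold.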
How do I determine what blob bitbucket.org_thekswenson_alpha and genomecuration_JAMg share?
~/lookup/cmpO.sh bitbucket.org_thekswenson_alpha genomecuration_JAMg share
produces
comparing bitbucket.org_thekswenson_alpha and genomecuration_JAMg
1 blobs created in bitbucket.org_thekswenson_alpha used in genomecuration_JAMg
0 blobs created in genomecuration_JAMg used in bitbucket.org_thekswenson_alpha
111 shared between bitbucket.org_thekswenson_alpha and genomecuration_JAMg
4828 blobs unique to bitbucket.org_thekswenson_alpha
26933 blobs unique to genomecuration_JAMg
created in bitbucket.org_thekswenson_alpha and present in genomecuration_JAMg
7a676be00044e5aa8fffd49d793ae9766cacf396;3rd_party/bin/gt
created in genomecuration_JAMg and present in bitbucket.org_thekswenson_alpha
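If you only want the blob;path lines from that output, one option is to filter for lines that start with a 40-character hex hash. This is a sketch run on a few lines copied from the output above, not a feature of cmpO.sh itself; the variable names are illustrative.

```shell
# A few lines copied from the cmpO.sh output shown above.
out='4828 blobs unique to bitbucket.org_thekswenson_alpha
created in bitbucket.org_thekswenson_alpha and present in genomecuration_JAMg
7a676be00044e5aa8fffd49d793ae9766cacf396;3rd_party/bin/gt
created in genomecuration_JAMg and present in bitbucket.org_thekswenson_alpha'

# Keep only lines that begin with a 40-character hex SHA-1 followed by ';'.
shared=$(printf '%s\n' "$out" | grep -E '^[0-9a-f]{40};')
printf '%s\n' "$shared"
```

Here the single remaining line names the blob and the path 3rd_party/bin/gt, which is the file that was copied.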
There are cases where running mostShared.sh on a project will not return anything, but when you run
zcat /da5_data/basemaps/gz/search.out | grep REPO | sort -t\; -n -k7
you get projects that use a blob originating from that project. Is there a good way to determine which blob they share, so that I can see what files were copied? An example would be:
zcat /da5_data/basemaps/gz/search.out | grep bitbucket.org_thekswenson_alpha | sort -t\; -n -k7
returns:
bitbucket.org_thekswenson_alpha;o;15;3;4939;4728;1;genomecuration_JAMg;u;402;1111;27044;10652;0
bitbucket.org_thekswenson_alpha;o;15;3;4939;4728;1;sestaton_tephra;o70;29;8;5429;4419;0
bitbucket.org_thekswenson_alpha;o;15;3;4939;4728;2;bitbucket.org_thekswenson_phagerecombination;o;1;0;195;191;0
How do I determine what blob bitbucket.org_thekswenson_alpha and genomecuration_JAMg share?
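For reference, the project pairs can be pulled out of search.out records like the ones above with awk. This is only a sketch: it assumes that fields 1 and 8 hold the two project names, which matches the sample records but is not documented in this thread.

```shell
# Two records copied from the search.out example above.
records='bitbucket.org_thekswenson_alpha;o;15;3;4939;4728;1;genomecuration_JAMg;u;402;1111;27044;10652;0
bitbucket.org_thekswenson_alpha;o;15;3;4939;4728;2;bitbucket.org_thekswenson_phagerecombination;o;1;0;195;191;0'

# Assumption: field 1 is the originating project and field 8 is the other project.
pairs=$(printf '%s\n' "$records" | awk -F\; '{print $1" "$8}')
printf '%s\n' "$pairs"
```

Each printed pair is a candidate for a pairwise comparison of the two projects.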