Closed by baizm 4 months ago
Note that fastStructure was written for Python 2, which reached end of life several years ago.
From: https://en.wikipedia.org/wiki/History_of_Python#Version_2
Python 2.7 support ended on January 1, 2020, along with code freeze of 2.7 development branch. A final release, 2.7.18, occurred on April 20, 2020, and included fixes for critical bugs and release blockers. This marked the end-of-life of Python 2.
There is a pull request with fixes to make it work under Python 3. It is nine years old at this point and has still not been merged into the main branch: https://github.com/rajanil/fastStructure/pull/26
I will try to build with the patches in this git pull request, but I don't know if it will work, or if the resulting module will be functional.
fastStructure has been built, and is going through internal testing. I will update this request once fastStructure has been published.
fastStructure version 1.0 (with the Python 3 fixes) has been published to the ccrsoft/2023.01 software release:
login1$ module spider fastStructure
----------------------------------------------------------------------------
faststructure: faststructure/1.0-Python-3.9.6
----------------------------------------------------------------------------
Description:
fastStructure is a fast algorithm for inferring population structure
from large SNP genotype data. It is based on a variational Bayesian
framework for posterior inference and is written in Python2.x.
You will need to load all module(s) on any one of the lines below before the "faststructure/1.0-Python-3.9.6" module is available to load.
gcc/11.2.0 openmpi/4.1.1
[...]
login1$ module load gcc/11.2.0 openmpi/4.1.1 faststructure/1.0-Python-3.9.6
login1$
There are three programs: structure.py, chooseK.py, and distruct.py. Running each one without arguments prints its usage:
login1$ structure.py
Here is how you can use this script
Usage: python /cvmfs/soft.ccr.buffalo.edu/versions/2023.01/easybuild/software/avx512/MPI/gcc/11.2.0/openmpi/4.1.1/faststructure/1.0-Python-3.9.6/structure.py
-K <int> (number of populations)
--input=<file> (/path/to/input/file)
--output=<file> (/path/to/output/file)
--tol=<float> (convergence criterion; default: 10e-6)
--prior={simple,logistic} (choice of prior; default: simple)
--cv=<int> (number of test sets for cross-validation, 0 implies no CV step; default: 0)
--format={bed,str} (format of input file; default: bed)
--full (to output all variational parameters; optional)
--seed=<int> (manually specify seed for random number generator; optional)
login1$
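As a rough illustration of the options listed above, here is a minimal Python sketch that builds (but does not run) a structure.py command line for each value of K in a small range. It only uses flags shown in the usage text. The file prefixes "genotypes" and "results/run", the K range, the seed value, and the helper name structure_command are all hypothetical; it also assumes structure.py is on your PATH after loading the module, as in the session above.

```python
import shlex

# Allowed values, taken from the usage text printed by structure.py.
VALID_PRIORS = {"simple", "logistic"}
VALID_FORMATS = {"bed", "str"}


def structure_command(k, input_prefix, output_prefix,
                      fmt="bed", prior="simple", seed=None):
    """Build the argument list for one structure.py run (nothing is executed)."""
    if k < 1:
        raise ValueError("K must be a positive integer")
    if prior not in VALID_PRIORS:
        raise ValueError(f"unknown prior: {prior}")
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    cmd = [
        "structure.py",
        "-K", str(k),
        f"--input={input_prefix}",
        f"--output={output_prefix}",
        f"--format={fmt}",
        f"--prior={prior}",
    ]
    # --seed is optional; only add it when the caller asks for one.
    if seed is not None:
        cmd.append(f"--seed={seed}")
    return cmd


if __name__ == "__main__":
    # Hypothetical prefixes and K range; replace with your own data.
    for k in range(1, 6):
        print(shlex.join(structure_command(k, "genotypes", "results/run", seed=100)))
```

The printed lines can be pasted into a job script, or each list can be passed to subprocess.run if you would rather launch the runs from Python.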
login1$ chooseK.py
Here is how you can use this script
Usage: python /cvmfs/soft.ccr.buffalo.edu/versions/2023.01/easybuild/software/avx512/MPI/gcc/11.2.0/openmpi/4.1.1/faststructure/1.0-Python-3.9.6/chooseK.py
--input=<file>
login1$
login1$ distruct.py
Here is how you can use this script
Usage: python /cvmfs/soft.ccr.buffalo.edu/versions/2023.01/easybuild/software/avx512/MPI/gcc/11.2.0/openmpi/4.1.1/faststructure/1.0-Python-3.9.6/distruct.py
-K <int> (number of populations)
--input=<file> (/path/to/input/file; same as output flag passed to structure.py)
--output=<file> (/path/to/output/file)
--popfile=<file> (file with known categorical labels; optional)
--title=<figure title> (a title for the figure; optional)
login1$
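For the two post-processing scripts, a similar hedged sketch follows. It builds command lines for chooseK.py and distruct.py from the flags in the usage text above. The prefix "results/run", the figure file name, the title, and the helper names are hypothetical. It also assumes that chooseK.py, like distruct.py, is given the same prefix that was passed to structure.py via --output; the usage text above only confirms this for distruct.py.

```python
import shlex


def choosek_command(output_prefix):
    """Build the chooseK.py command (assumes the structure.py output prefix)."""
    return ["chooseK.py", f"--input={output_prefix}"]


def distruct_command(k, output_prefix, figure_path, popfile=None, title=None):
    """Build the distruct.py command for one value of K (nothing is executed)."""
    cmd = [
        "distruct.py",
        "-K", str(k),
        f"--input={output_prefix}",   # same prefix given to structure.py --output
        f"--output={figure_path}",
    ]
    # --popfile and --title are optional according to the usage text.
    if popfile is not None:
        cmd.append(f"--popfile={popfile}")
    if title is not None:
        cmd.append(f"--title={title}")
    return cmd


if __name__ == "__main__":
    # Hypothetical names; replace with your own prefix, K, and figure file.
    print(shlex.join(choosek_command("results/run")))
    print(shlex.join(distruct_command(3, "results/run", "results/run_K3.svg",
                                      title="Example plot")))
```

shlex.join quotes any argument that contains spaces (such as a multi-word title), so the printed commands are safe to paste into a shell.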
Software name: fastStructure
Preferred version: latest
Website: https://rajanil.github.io/fastStructure/