Estimation of current effective population using artificial neural networks
The g++ compiler is required to build the program. It should be included in (almost) every Linux distribution. To install it in Debian-like distributions:
sudo apt install g++
make is only needed if you want to use the make commands to compile the program; you could also compile it directly with g++. To install make, run the following command (in Debian-like distributions):
sudo apt install make
Clone the GitHub repo:
git clone https://github.com/esrud/currentNe
Compile:
cd currentNe
make
Another way to compile the program, which will link it statically:
make static
The program has been tested on Arch Linux, Ubuntu 20.04 and Debian Buster, using g++ version 7.2.0 and above as a compiler.
Please note that the program needs at least 3 GB of free RAM to run. The memory requirement depends on the maximum number of loci, the maximum number of individuals sampled, the maximum number of chromosomes and the maximum distance between loci. These limits can be increased (or reduced) by changing the constants MAXLOCI, MAXIND, MAXCROMO and MAXDIST, respectively. Note that increasing those values increases the amount of free RAM required.
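A quick pre-flight check of available memory can be sketched as follows. This is only an illustration: it assumes a Linux system that exposes a MemAvailable field in /proc/meminfo, the function name available_gib is hypothetical (it is not part of currentNe), and the threshold of 3 is the minimum stated above for the default constants.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of currentNe).

# Print the available memory in whole GiB, read from /proc/meminfo (Linux only).
available_gib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

required_gib=3            # minimum stated in this README for the default constants
avail=$(available_gib)    # empty if MemAvailable could not be read

# Warn (on stderr) when less memory is available than the stated minimum.
if [ "${avail:-0}" -lt "$required_gib" ]; then
    echo "Warning: ${avail:-0} GiB available, at least ${required_gib} recommended." >&2
fi
```

If MAXLOCI, MAXIND, MAXCROMO or MAXDIST have been increased, the threshold in required_gib would need to be raised accordingly.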
currentNe - Current Ne estimator (v1.0 - Jan 2023)
Authors: Enrique Santiago - Carlos Köpke
USAGE: ./currentNe [OPTIONS] <filename_with_extension> <number_of_chromosomes>
where filename is the name of the data file in vcf, ped or tped
format. The filename must include the name extension
.vcf, .ped or .tped according to its format.
If the assignments of SNPs to chromosomes are available in the
input file, an additional estimate based only on pairs of SNPs
located on different chromosomes will also be calculated.
If the ped format is used, this additional estimate will be
made if the corresponding map file is available in the same
directory as the ped file.
OPTIONS:
-h Print out this manual.
-s Number of SNPs to use in the analysis (all by default).
-k -If a POSITIVE NUMBER is given, the number of full siblings that
a random individual has IN THE POPULATION (the population is the
set of reproducers). With full lifetime monogamy k=2, with 50%
of monogamy k=1 and so on. With one litter per multiparous
female k=2, with two litters per female sired by the same father
k=2 but if sired by different fathers k=1, in general, k=2/Le
where Le is the effective number of litters (Santiago et al. 2023).
-If ZERO is specified (i.e., -k 0), each offspring is assumed to
be from a new random pairing.
-If a NEGATIVE NUMBER is specified, the average number of full
siblings observed per individual IN THE SAMPLE. The number k of
full siblings in the population will be estimated along with Ne.
-BY DEFAULT, i.e. if the modifier is not used, the average number
of full siblings k will be estimated from the input data.
-o Specifies the output filename. If not specified, the output
filename is built from the name of the input file.
-t Number of threads (default: 8)
-q Run quietly. Only prints out Ne estimation
-p Print the analysis to stdout. If not specified a file will be created
-v Only used with -q. Prints also the bounds for the two confidence
intervals of 50% and 90%.
EXAMPLES:
- Random mating and 20 chromosomes (equivalent to a genome of 20 Morgans),
assuming that full siblings are no more frequent than expected under
random pairing (each offspring from a new random pairing).
./currentNe -k 0 filename 20
- Same as before but only a random subsample of 100000 SNPs
will be analysed:
./currentNe -k 0 -s 100000 filename 20
- Same as before but full siblings could be more frequent than expected
under random pairing. Full siblings will be identified from the
genotyping data in the ped file:
./currentNe -s 100000 filename 20
- Two full siblings per individual (k = 2) IN THE POPULATION:
./currentNe -k 2 filename 20
- 80% lifetime monogamy in the population. Output filename specified:
./currentNe -k 1.6 -o SS81out filename 20
(with a monogamy rate m = 0.80, the expected number of full
siblings that a random individual has is k = 2*m = 1.6)
- If 0.2 full siblings per individual are OBSERVED IN THE SAMPLE:
./currentNe -k -0.2 filename 20
(NOTE the MINUS SIGN before the number of full siblings 0.2)
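The two conversions mentioned above (k = 2*m for a monogamy rate m, and k = 2/Le for an effective number of litters Le) can be computed in the shell before calling the program. The following is a minimal sketch: the helper names k_from_monogamy and k_from_litters are hypothetical and not part of currentNe, and awk is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of currentNe) for deriving the value passed to -k.

# k = 2*m, where m is the lifetime monogamy rate (between 0 and 1).
k_from_monogamy() {
    awk -v m="$1" 'BEGIN { printf "%g\n", 2 * m }'
}

# k = 2/Le, where Le is the effective number of litters (must be positive).
k_from_litters() {
    awk -v le="$1" 'BEGIN { if (le <= 0) exit 1; printf "%g\n", 2 / le }'
}

k_from_monogamy 0.80   # 80% lifetime monogamy, prints 1.6
k_from_litters 2       # two litters sired by different fathers, prints 1
```

The result can then be substituted into the command line, for example ./currentNe -k "$(k_from_monogamy 0.80)" -o SS81out filename 20, which corresponds to the 80% monogamy example above.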