JielinLi closed this issue 7 months ago
These are tabular text files that contain SNPs of interest; i.e., they are used for specifying a list of SNPs to run the selection pipeline on.
For example, "mathieson" points to a file called 41586_2015_BFnature16152_MOESM270_ESM.txt, which is the 'Supplementary Data 3' table from Mathieson et al. (2015): https://www.nature.com/articles/nature16152#Sec14
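A minimal sketch of pulling a list of SNP IDs out of a tabular text file like this is shown below. The tab delimiter, the column name `rsid`, and the toy IDs are assumptions for illustration only, not the actual layout of the Mathieson table; check the header of the real file before reusing it.

```python
import csv
import io


def read_snp_ids(handle, column="rsid", delimiter="\t"):
    """Return the unique, non-empty values of `column`, in file order."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header {reader.fieldnames}")
    seen = {}  # dict keeps insertion order, so duplicates are dropped stably
    for row in reader:
        value = (row[column] or "").strip()
        if value:
            seen.setdefault(value, None)
    return list(seen)


# Toy input with a hypothetical header and made-up IDs (one duplicate row).
example = io.StringIO("rsid\tchrom\tpos\nrsA\t2\t100\nrsB\t3\t200\nrsA\t2\t100\n")
snp_ids = read_snp_ids(example)
print(snp_ids)
```

For a file on disk, the same function can be called on `open(path, newline="")` instead of the `io.StringIO` toy input.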
I understand, thank you very much. I also wanted to ask: why is 'chr3_true_paths' listed separately in the input file?
We made two alternative versions of the chr3 simulations: one in which the ancestral paths were inferred by the machine learning classifier (i.e., chr3_inferred_paths), and another in which the genotypes were labeled with the true ancestral paths from the simulation (i.e., chr3_true_paths). This allowed us to isolate the effects of path mislabelling on the selection test, and to confirm that there was no major bias.
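To make "path mislabelling" concrete, here is a minimal sketch of how a mislabelling rate between true and inferred labels could be computed. The label values and the flat list layout are hypothetical; this is not the comparison code used for the paper.

```python
def mislabel_rate(true_paths, inferred_paths):
    """Fraction of positions where the inferred label differs from the true one."""
    if len(true_paths) != len(inferred_paths):
        raise ValueError("label lists must have the same length")
    if not true_paths:
        raise ValueError("label lists must not be empty")
    mismatches = sum(t != i for t, i in zip(true_paths, inferred_paths))
    return mismatches / len(true_paths)


# Hypothetical path labels for four haplotypes; one of the four is mislabelled.
true_labels = ["A", "B", "A", "C"]
inferred_labels = ["A", "B", "C", "C"]
rate = mislabel_rate(true_labels, inferred_labels)
print(rate)
```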
Thank you very much for your patient response! Does it mean that I can choose not to include this file when I'm working? I also wanted to ask, what is the difference between 'ancestral_paths_new' and 'ancestral_paths_v3'? They seem to have the same metadata.
Yes, you can choose to omit any or all of these input files. All you need are the input files necessary to make the specific outputs you are requesting from the snakemake scheduler.
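The idea that only the inputs behind the requested outputs are needed can be sketched as a backward walk over rule dependencies. The file names and the `toy_rules` mapping below are hypothetical and are not taken from this workflow; the sketch only illustrates the principle, not snakemake's actual implementation.

```python
def required_sources(target, rules):
    """Collect the files that `target` ultimately depends on and that no rule produces."""
    needed = set()
    visited = set()
    stack = [target]
    while stack:
        item = stack.pop()
        if item in visited:
            continue
        visited.add(item)
        if item in rules:
            # Produced by a rule: keep walking back through its inputs.
            stack.extend(rules[item])
        else:
            # No rule makes this file, so the user must supply it.
            needed.add(item)
    return needed


# Hypothetical rules, written as output -> list of inputs.
toy_rules = {
    "report_a.txt": ["snps_a.txt", "filtered.vcf"],
    "report_b.txt": ["snps_b.txt", "filtered.vcf"],
    "filtered.vcf": ["raw.vcf"],
}
sources = required_sources("report_a.txt", toy_rules)
print(sorted(sources))
```

In this toy example, requesting only `report_a.txt` never touches `snps_b.txt`, so that file could be left out.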
The difference between ancestral_paths_new and ancestral_paths_v3 is that the local ancestry model used to infer the ancestral paths was updated during review of the paper, and a new version (v3) was created.
I understand, thank you so much!
Hello, when using RELATE, I downloaded 'humanancestor{chr}.fa' from 'http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/supporting/ancestral_alignments/'. Are these files the same as '{chr}.humanc_e71.fa' in your 'relate.smk'? Thank you!
No, they are not the same version of the human ancestral sequence. The version you linked to is from Ensembl build e59 and the version I used is e71. In practice, this will probably not have a major impact, as there are unlikely to have been major changes in the inference of the ancestral alleles between these two builds.
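One way to check how much two builds differ for a given chromosome is to count the positions where the sequences disagree. The sketch below assumes each FASTA file holds a single record and that both builds use the same coordinates (equal sequence length); it compares bases case-insensitively, which is a simplifying choice. The toy sequences are made up.

```python
import io


def read_fasta_sequence(handle):
    """Concatenate the sequence lines of a single-record FASTA file."""
    return "".join(line.strip() for line in handle if not line.startswith(">"))


def count_differences(seq_a, seq_b):
    """Count positions where two equal-length sequences differ, ignoring case."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a.upper() != b.upper() for a, b in zip(seq_a, seq_b))


# Made-up toy records standing in for the same chromosome in two builds.
old_build = io.StringIO(">toy_old\nACGTN\nacgt\n")
new_build = io.StringIO(">toy_new\nACGTA\nacga\n")
n_diff = count_differences(read_fasta_sequence(old_build), read_fasta_sequence(new_build))
print(n_diff)
```

For real chromosomes the files are large, so it may be worth restricting the comparison to the SNP positions actually used in the analysis.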
Thank you so much!
Hello, what are the input files for 'imputed', 'andres', 'inv17_h1h2', and 'mathieson' in the config? Thank you!