Closed kasmiyassin closed 6 months ago
Could you send by email the necessary files to reproduce the error please.
Ok thanks So I usually run a couple of analysis before using other functions in radiator...
test1 <- radiator::read_vcf(data = "populations.snps.vcf", strata = "info_sex.txt")
By default...
Here are a few things that I wanted to bring to your attention.
Detected 1536 duplicate SNPs on different strands (+/-)
By default radiator prune those SNPs
To change this behavior, use argument: filter.strands
Filter duplicated markers on different strands: blacklist
Number of individuals / strata / chrom / locus / SNP:
Before: 27 / 2 / 342 / 24642 / 58873
Blacklisted: 0 / 0 / 0 / 373 / 1536
After: 27 / 2 / 342 / 24269 / 57337
For info on the internal function do this in RStudio or R console: ?radiator::filter_monomorphic
Now the numbers are not small...
Filter monomorphic markers
Number of individuals / strata / chrom / locus / SNP:
Blacklisted: 0 / 0 / 2 / 839 / 5369
Which begs the question... Did you filter your dataset before attempting the sexy_markers function ? Because this function goes deeper into the dataset but requires a clean/filtered dataset otherwise you find anything really of interest.
By default theradiator::read_vcf
function keeps only common markers but for certain analysis you might be interested in keeping those. Here, you have 631 SNPs not in common. If you look at the folder generated by the read function you will see a folder named something like that where the first 2 digits might be different 04_filter_common_markers
. Inside this is a UpSetR figure showing you have 51337 SNPs in common, 468 SNPs only found in males and 163 SNPs only found in females.
You might want to double check if this is artifactual of your dataset, filtering etc. Did you use this strata file to generate the VCF in Stacks ?
The markers info is found in the file starting with this: blacklist.not.common.markers
Filter common markers:
Number of individuals / strata / chrom / locus / SNP:
Blacklisted: 0 / 0 / 1 / 83 / 631
This alone could be your answer to distinguish males/females
Did you keep in the dataset duplicated individuals or technical replicates ? Because these samples can impact analysis ... a lot.
Run this:
check1 <- radiator::detect_duplicate_genomes(data = test1)
Look at the figures it generated.
In your strata we see
F1_MEM203 and F2_MEME203
I thought at first these were the same sample but if you open the file individuals.pairwise.dist.tsv
you see inside it's for example: F2-MEM198 and F2-MEM207
that are very very close. At the point that if you keep only 1 sample between identified duplicates, and remove only the sample with more missing genotypes, you will end up less than half the sample size (blacklisting 16 samples). 11 samples remaining with 4 females and 7 males...
You might want to check this before going further in your analysis.
I have a VCF file generate from Stacks2, and I prepared my STRAT fille as described but I recieve all time this error:
when I use the commands
Then the command
I tested also the command
Then the command
but I got the same error.
My strata looks like
Can anyone help me to solve this issue please?