Closed by silknets 1 year ago
You simply ran out of SNPs for this step to work.
Check the radiator::filter_snp_position_read section
in the file session.txt.
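To see how few SNPs and loci a VCF holds before it goes through a long filtering pipeline, a quick count with base R can help. This is only a minimal sketch: the helper `count_vcf_markers` is hypothetical (not part of radiator), and it assumes the VCF ID column starts with the locus number followed by `:` (for example `12:34:+`), which may not match your file.

```r
# Sketch: count SNP records and distinct loci in a VCF with base R only.
# ASSUMPTION: the ID column (3rd field) starts with the locus number followed
# by ":" (for example "12:34:+"); adjust the parsing if your IDs differ.
count_vcf_markers <- function(path) {
  lines <- readLines(path)
  # Drop header lines (starting with "#") and empty lines.
  records <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Split each record on tabs and keep the ID field.
  ids <- vapply(strsplit(records, "\t", fixed = TRUE),
                function(x) x[3], character(1))
  # The locus is everything before the first ":" in the ID.
  loci <- sub(":.*$", "", ids)
  list(markers = length(records), loci = length(unique(loci)))
}

# Tiny made-up VCF, for illustration only.
demo_vcf <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
  "un\t10\t1:9:+\tA\tG\t.\tPASS\t.\tGT\t0/1",
  "un\t25\t1:24:+\tC\tT\t.\tPASS\t.\tGT\t0/0",
  "un\t140\t2:39:+\tG\tA\t.\tPASS\t.\tGT\t1/1"
), demo_vcf)

counts <- count_vcf_markers(demo_vcf)
print(counts)
```

If the marker count is already small, each additional filter leaves even less for later steps such as the duplicate check.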
If you have already filtered your data elsewhere (Stacks?), why run it through filter_rad?
Here are the checks I usually do when I receive a dataset:
data <- radiator::read_vcf(data = "populations.snps.vcf")
VCF summary
Missing data:
markers: 0.17
individuals: 0.17
Coverage info:
individuals mean total coverage: 361750
individuals mean genotype coverage: 95
markers mean coverage: 100
VCF info:
Number of chromosome/contig/scaffold: 1
Number of locus: 199
Number of markers: 4445
Number of strata: 1
Number of individuals: 188
Number of ind/strata:
1pop = 188
I didn't have your strata file, but it isn't really needed for these checks.
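For anyone who does want to supply a strata file, here is a minimal sketch of building one with base R. It assumes the usual radiator layout of a tab-separated file with `INDIVIDUALS` and `STRATA` columns; the population name `pop1` and the output path are placeholders, and only three sample IDs mentioned in this thread are listed.

```r
# Sketch: build a one-population strata file with base R.
# ASSUMPTION: tab-separated file with columns INDIVIDUALS and STRATA.
# Only three IDs from this thread are listed; replace with all of your samples.
sample_ids <- c("sb0302", "sb0405", "sb0417")

strata <- data.frame(
  INDIVIDUALS = sample_ids,
  STRATA = "pop1",          # placeholder population name
  stringsAsFactors = FALSE
)

strata_file <- file.path(tempdir(), "strata.tsv")
write.table(strata, strata_file, sep = "\t", quote = FALSE, row.names = FALSE)

# With radiator installed and the VCF available, the file could then be
# passed along with the VCF (not run here):
# data <- radiator::read_vcf(data = "populations.snps.vcf", strata = strata_file)
```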
**I see several problems:**
First, the duplicate check. That's the step that failed in filter_rad
because you had no markers left. Here is the same analysis run independently on your VCF:
dup <- radiator::detect_duplicate_genomes(data = data)
I would really like to know what you're working on. When I see a graph like this one, it usually means very closely related samples (close kin, families, etc.) and technical and/or lab duplicates.
This last check looks for wet-lab trouble, mixed samples, etc.
mix <- radiator::detect_mixed_genomes(data = data)
These samples (sb0302, sb0405, sb0417) are not like the rest; they are definitely outliers. When I see this, it's usually another species or something went wrong in the wet lab...
Hope this helps; re-open the issue if you have another problem.
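As a rough illustration of what "not like the rest" can mean numerically, the sketch below flags samples whose per-individual statistic falls outside Tukey's 1.5 x IQR fences. This is a generic rule and not radiator's internal method. The helper `flag_outliers`, the column names, and the heterozygosity values are all made up for the example.

```r
# Sketch: flag individuals whose value falls outside Tukey's 1.5 x IQR fences.
# This is a generic rule, NOT how radiator::detect_mixed_genomes works.
flag_outliers <- function(ids, values, k = 1.5) {
  q <- stats::quantile(values, probs = c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  ids[values < q[1] - fence | values > q[2] + fence]
}

# Made-up heterozygosity values for hypothetical samples, illustration only.
het <- data.frame(
  INDIVIDUALS = paste0("ind", 1:6),
  HET = c(0.10, 0.11, 0.12, 0.10, 0.11, 0.35),
  stringsAsFactors = FALSE
)

outliers <- flag_outliers(het$INDIVIDUALS, het$HET)
print(outliers)
```

A rule like this only points at candidates; whether a flagged sample is another species or a wet-lab problem still needs a look at the plots and the lab records.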
Hello All,
I'm a new Radiator user, and I imagine my error has a simple resolution attributable to the user. However, I've been unable to resolve this by reviewing closed issues or digging into this GitHub repo. I'm adding this blank issue in the hopes that I can have it resolved, or to better understand why the error was thrown!
For some context, I'm running filter_rad() on my strictest VCF output from a Stacks:populations run (only 199 loci for 188 samples). I also selected pretty generic filters for filter_rad(), just to start making sense of the outputs. However, when the run completed, I noted the error below. As a new user, it's unclear to me whether this error prevented the generation of additional plots / figures that would help me assess the sequence data + underlying biology going on. I'm including four files for additional info if needed per the contributing guidelines (devtools::session_info, full error text, text of the full session for info on the filters selected, and a zipped folder with the VCF input).
Thanks from a grateful user - Sam S.
###################### radiator::detect_duplicate_genomes ######################
################################################################################
Execution date@time: 20230315@1148
Function call and arguments stored in a file
File written: radiator_detect_duplicate_genomes_args_20230315@1148.tsv
File written: random.seed (314710)
Error in dplyr::mutate():
ℹ In argument: MISSING_PROP = round(...).
Caused by error in seqParallel():
! No variants selected.
Run rlang::last_error() to see where the error occurred.
Warning messages:
1: There was 1 warning in dplyr::mutate().
ℹ In argument: WHITELISTED_MARKERS = purrr::map_int(...).
Caused by warning:
! Using one column matrices in filter() was deprecated in dplyr 1.1.0.
ℹ Please use one dimensional logical vectors instead.
ℹ The deprecated feature was likely used in the dplyr package.
Please report the issue at https://github.com/tidyverse/dplyr/issues.
This warning is displayed once every 8 hours.
Call lifecycle::last_lifecycle_warnings() to see where this warning was generated.
2: Removed 144 rows containing missing values (geom_point()).
3: Removed 130 rows containing missing values (geom_point()).
session_info.txt full_error.txt session.txt populations.snps.zip