Closed: flojopi closed this issue 4 years ago
Hi Floriaan, would you mind sending the genlight object rat8rec_reassign by email?
There's a problem with the genlight, not the tidying process, at the step where the GDS file is produced...
looking into it...
Generating the GDS: there was a bug, and it's now fixed temporarily. (It will need to be optimized, because it now requires the tidy data instead of just using the genlight, which is faster.)
the conversion to fineradstructure...
The function looks for a column in the tidy data: GT_VCF_NUC
It's not generated by the tidying function because it can't find the information. The nucleotides in the genlight from dartR are stored in the LOCUS name...
They're also in the slot @loc.all
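To make the point about @loc.all concrete, here is a minimal base R sketch of pulling the two alleles out of strings in that style. It assumes the entries look like "A/G" and that the first allele is the reference. The example strings and the helper name split_alleles are hypothetical, and this is not the code radiator uses internally.

```r
# Hypothetical allele strings, in the "REF/ALT" style a genlight's
# @loc.all slot may hold (with a real genlight `gl`: gl@loc.all).
loc_all <- c("A/G", "C/T", "g/a")

# Hypothetical helper (not part of radiator or adegenet):
# split each "X/Y" string into one REF and one ALT nucleotide.
split_alleles <- function(x) {
  parts <- strsplit(toupper(x), split = "/", fixed = TRUE)
  data.frame(
    REF = vapply(parts, function(p) p[1], character(1)),
    ALT = vapply(parts, function(p) p[2], character(1)),
    stringsAsFactors = FALSE
  )
}

alleles <- split_alleles(loc_all)
print(alleles)
```

Whether the first allele really is the reference depends on how the genlight was built, so treat the REF/ALT labels here as an assumption.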
...
by default it will re-calibrate the alleles REF/ALT
The problems with your data and the conversion to fineradstructure are:
will have to sleep on this one and think about PLAN B.
Do you have the manual with the input format detailed? I couldn't find one with those details about the letters for the pop... 🧐🤔
Ok so back on it and found a solution...
It has become increasingly difficult for me to follow all the different naming schemes researchers use, if there is any strategy at all... Consequently, I have abandoned the idea of formatting the populations with letters to generate the fineRADstructure file.
Now, a couple of checks will generate errors if bad population ids are detected.
Adding the argument strata in the function will allow people to easily rename the strata, so that I no longer spend time on this ;)
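As a rough illustration of renaming strata through a file, the base R sketch below writes a small tab-separated strata file with cleaned population names. It assumes the usual two-column layout (INDIVIDUALS and STRATA). The sample ids, the second population name and the output file name are hypothetical.

```r
# Hypothetical individuals and their original population labels.
strata <- data.frame(
  INDIVIDUALS = c("ind_01", "ind_02", "ind_03"),
  STRATA = c("_Marlborough_Snd", "_Marlborough_Snd", "PopB"),
  stringsAsFactors = FALSE
)

# Strip any leading "_" or "-" so the population ids start with a letter.
strata$STRATA <- sub("^[_-]+", "", strata$STRATA)

# Write a tab-separated file (assumed layout: INDIVIDUALS and STRATA columns).
strata_file <- file.path(tempdir(), "new.strata.tsv")
write.table(strata, strata_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The resulting file path could then be passed to the strata argument.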
In your case, the problem was:
1) more populations than alphabet letters
2) and even if we skip that part and use your pop names, you have 1 pop that starts with _ or -: _Marlborough_Snd
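For anyone who wants to screen population ids before converting, the two problems above can be sketched in base R as follows. The helper check_pop_ids and the example vector are hypothetical, and this is not radiator's actual check, only an approximation of the two conditions described.

```r
# Hypothetical helper (not radiator's implementation): flag the two
# problems described above for a vector of population ids.
check_pop_ids <- function(pop) {
  pops <- unique(pop)
  list(
    # More populations than the 26 letters of the alphabet?
    too_many_for_letters = length(pops) > length(LETTERS),
    # Population ids that start with "_" or "-".
    bad_start = pops[grepl("^[_-]", pops)]
  )
}

# Hypothetical population labels; only _Marlborough_Snd comes from the thread.
pop <- c("_Marlborough_Snd", "PopB", "-PopC", "PopB")
res <- check_pop_ids(pop)
print(res)
```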
Should work now with radiator v.1.1.5
Below are different tests you can run to see whether you get an error, as expected, or whether the call should pass. They also show you the different ways to generate the output you want:
library(magrittr) # provides the %>% pipe used in test3

data <- radiator::tidy_genlight(
  data = "rat8rec_reassign.rds",
  gds = FALSE, # faster if you only want the fineradstructure file
  verbose = TRUE
)
# works
test1 <- radiator::write_fineradstructure(data = data)
# error
test2 <- radiator::write_fineradstructure(data = data, strata = "floriaan.new.strata.tsv")
# works
test3 <- radiator::tidy_genlight(
  data = "rat8rec_reassign.rds",
  gds = FALSE
) %>%
  radiator::write_fineradstructure(data = ., strata = "floriaan.new.strata.tsv")
# works
test4 <- radiator::genomic_converter(data = "rat8rec_reassign.rds", output = "fineradstructure")
# error
test5 <- radiator::genomic_converter(
  data = "rat8rec_reassign.rds",
  strata = "floriaan.new.strata.tsv",
  output = "fineradstructure"
)
FYI
Since it's dartR filtered data I went to have a look, out of curiosity:
Bottom line, I think it's premature to run population structure on the dataset as it is...
Good luck. Re-open the issue if you're still having problems.
I'm emailing the strata that I used for the code to work properly.
Thierry
Hi Thierry,
I have DArT data that I filtered using the dartR package, and I am now trying to export the final, filtered genlight object to the fineRADstructure input format, but I keep getting the error:
and that was after I applied the tidy_genomic_data function:
which appears to have worked fine.
Is there anything missing in that tidy_rat8rec_assign that the function needs to work? I am not sure about the "nucleotide information is required" message. Does that mean it wants the whole SNP sequence? REF and ALT are already in that table.
I was also thinking that maybe something is wrong with that genlight object, so I used
and I get the same error, so it's not as if any information got lost during the filtering process.
Thanks, Flo