rystanley / genepopedit

Simple and flexible manipulation of genomic data.

subset_genepop #16

Open JHHartman opened 1 year ago

JHHartman commented 1 year ago

Hello, I am trying to use subset_genepop to subset a set of loci names using "subs" in the getTopLoc function. When I get to this part of the code:

genepopedit::subset_genepop(genepop = sub_data_path, subs = FST.Filter.Vec, keep = TRUE, path = paste0(path.start, "/", "subset_for_LD.txt"))

I receive the error:

"Error in [.data.frame(temp2, , c(subs, "Pop")) : undefined columns selected"

I tried running this piece of code outside of the getTopLoc function and still couldn't get it to work. Has anyone else had this problem, or does anyone know how to fix it?

rystanley commented 1 year ago

Reading the tea leaves, this could be one of two things. 1) genepopedit can't discriminate the populations from within the sample IDs. The format we had to force was a unique population name and a unique number separated by an underscore (e.g., ABC_01 will be picked up as population ABC and sample 01, but a sample ID entered as ABC01 would not work). 2) The vector FST.Filter.Vec contains a locus name (or names) that does not match the loci in the genepop file at sub_data_path.
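
The sample ID convention described above can be checked before calling subset_genepop. The following is a minimal sketch in base R, assuming the IDs are already available as a character vector. The vector sample_ids and the regular expression are illustrative assumptions, not part of genepopedit, and the pattern assumes exactly one underscore per ID.

```r
# Hypothetical sample IDs; in practice these would come from the genepop file.
sample_ids <- c("ABC_01", "ABC_02", "XYZ_01", "ABC01")

# Assumed pattern: population name, a single underscore, then a sample number.
ok <- grepl("^[^_]+_[0-9]+$", sample_ids)

# IDs that do not follow the POP_NUMBER convention.
bad_ids <- sample_ids[!ok]
print(bad_ids)

# Population labels implied by the well-formed IDs.
pops <- unique(sub("_.*$", "", sample_ids[ok]))
print(pops)
```

Any ID listed in bad_ids would need to be renamed (for example ABC01 to ABC_01) before the populations can be picked up.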

Try this:

setdiff(FST.Filter.Vec, genepop_detective(sub_data_path, "Loci"))

That should return any names in FST.Filter.Vec that are not loci in the genepop file.
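
The direction of the comparison matters, because setdiff(x, y) returns the elements of x that are not in y. Here is a small sketch with toy vectors standing in for the real data. The names file_loci and requested are hypothetical: file_loci plays the role of the loci in the genepop file and requested plays the role of FST.Filter.Vec.

```r
# Toy stand-ins: loci present in the genepop file and loci requested via `subs`.
file_loci <- c("Loc_1", "Loc_2", "Loc_3", "Loc_4")
requested <- c("Loc_2", "Loc_4", "loc_5")

# Requested names absent from the file; selecting these as data frame
# columns is what produces "undefined columns selected".
missing_from_file <- setdiff(requested, file_loci)
print(missing_from_file)

# Names that exist in both, which can be subset safely.
valid_subs <- intersect(requested, file_loci)
print(valid_subs)
```

If missing_from_file is non-empty, it is worth checking those names for case differences or stray whitespace before passing the vector to subs.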

rystanley commented 1 year ago

Also note that we have pushed a small update to hybriddetective, which should resolve the indexing error that crept in with a data.table update. If you re-install, it should hopefully work better.