Open jbruxaux opened 7 months ago
interesting behavior here. is there any reason to believe that that portion of the chromosome is non-recombining?
No, not at all. Each portion is 500 Mb long (except the last one, which is shorter) and harbors a lot of SNPs, and no other parameter is different for that region (heterozygosity, nucleotide diversity...).
can you rerun the BSCORRECT module? I wonder if you will get a different answer?
I will have to wait a few days (the GPU nodes on our cluster are under maintenance), but I will try as soon as possible.
you might also try this on a cpu?
Hi!
Sorry, it took me a few weeks to come back to this, but I re-ran the analysis on a smaller chromosome (only the BSCORRECT part), and I get the exact same result as last time (using the cmp command). So this part is at least consistent.
okay! this is very strange. any chance this has to do with seeds being maintained across runs? are you setting seeds manually?
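For context on the seed question, here is a minimal sketch of why a fixed seed gives byte-identical output across runs. It is a generic toy bootstrap, not ReLERNN's code; the function name `bootstrap_means` and the data values are made up for illustration.

```python
import random

def bootstrap_means(data, n_boot, seed=None):
    """Resample `data` with replacement n_boot times and return the mean of each resample.

    Hypothetical helper for illustration only; it is not part of ReLERNN.
    """
    rng = random.Random(seed)  # a private generator, so the seed fully determines the draws
    means = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        means.append(sum(sample) / len(sample))
    return means

data = [0.8, 1.1, 0.9, 1.4, 1.0, 1.3]  # made-up values

run_a = bootstrap_means(data, n_boot=5, seed=42)
run_b = bootstrap_means(data, n_boot=5, seed=42)  # same seed: same resamples as run_a
run_c = bootstrap_means(data, n_boot=5, seed=7)   # different seed: resamples will almost surely differ

print("same seed, identical results:", run_a == run_b)
print("different seed, identical results:", run_a == run_c)
```

So two runs that agree under cmp only show that the pipeline is deterministic for a given seed; they say nothing about how much the result would move under a different seed.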
Yes, I used the same seed! I re-ran again with different seeds, and here is what I get (for a smaller chromosome).
The original run: [figure]
The new run: [figure]
For reference, the uncorrected results: [figure]
okay this is definitely a seed issue. when is the last time you pulled the code? we recently fixed a potential bug in commit #48 which may affect bootstrapping with a manual seed.
what version are you using @jbruxaux ? there was a bug fixed wrt setting seeds a couple of months ago (here) but you'll have to use the current github version of the repo
jinx
A quick update. I re-ran the analysis from scratch without any seed (and with the latest ReLERNN version). I still see this pattern, but not as strongly. Here are the non-corrected results: [figure]
And here are the corrected results: [figure]
What do you think? As you can see, my third part (from 1000 in the graph, so 1Gb in reality) shows almost no correction compared to the two others.
Hey @jbruxaux sorry for the long silence ... if you still have your script handy, could you please post the exact invocation for BSCORRECT? Were you using the default arguments; or a copy-paste from the examples/example_pipeline.sh (e.g. with --nSlice 2 --nReps 2); or something else?
If it's the copy-paste from example_pipeline.sh, then I think that's the issue -- two slices and two reps per slice is way too few (the defaults are 100 slices, 1000 reps per slice or thereabouts), which would make the correction very, very noisy.
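To make the noise argument concrete, here is a toy sketch (not ReLERNN's actual bootstrap procedure) in which a known bias is estimated from a handful of simulated replicates. Everything in it is an assumption for illustration: the function names, the Gaussian noise model, and the values `true_bias=0.5` and `noise_sd=1.0`. The point is only that the spread of an averaged estimate shrinks roughly as noise_sd / sqrt(n_reps).

```python
import random
import statistics

def estimate_bias(n_reps, rng, true_bias=0.5, noise_sd=1.0):
    """Average n_reps noisy replicates to estimate a known bias (toy model, not ReLERNN)."""
    errors = [true_bias + rng.gauss(0.0, noise_sd) for _ in range(n_reps)]
    return statistics.mean(errors)

def spread_of_estimates(n_reps, n_runs=200, seed=0):
    """Standard deviation of the bias estimate across n_runs independent runs."""
    rng = random.Random(seed)
    estimates = [estimate_bias(n_reps, rng) for _ in range(n_runs)]
    return statistics.stdev(estimates)

few = spread_of_estimates(n_reps=2)      # the value in the copied example script
many = spread_of_estimates(n_reps=1000)  # roughly the default mentioned above

print(f"run-to-run spread with 2 reps:    {few:.3f}")
print(f"run-to-run spread with 1000 reps: {many:.3f}")
```

With only two replicates, the estimated correction varies from run to run by an amount comparable to the noise itself, which is consistent with one 500 Mb piece appearing almost uncorrected while the others are shifted.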
Thanks!
I indeed used the example_pipeline.sh script, with --nSlice 2 --nReps 2. So that explains the issue. I will give it another try with more reasonable values. Thank you for your feedback!
Hi!
I used ReLERNN to estimate the recombination rate along a very long genome, and ran the analyses in pieces of 500 Mb. The results for the different parts are comparable when I use the output of the "predict" function, but differ a lot after correction with the "bscorrect" function. For example, before correction: [figure]
And after correction: [figure]
Any idea what could cause such differences? Is there anything I should do? Thanks in advance!