JaneliaSciComp / msg

Multiplexed Shotgun Genotyping
http://genomics.princeton.edu/AndolfattoLab/MSG.html

summaryPlots.R - Generate Off-diagonal LOD on Thinned Data instead of full data #34

Open gregpinero opened 12 years ago

gregpinero commented 12 years ago

(Also put in delete and garbage collection statements to try to save memory.)

Misc notes:

My guess is we don't need to perform step 2 on the complete data set. Calculating the LOD profile on the thinned data should produce nearly identical results. This may save a lot of computation time, since this is an N x N calculation.
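To give a rough feel for the savings, here is a minimal sketch in R. Everything in it is an assumption for illustration: the random `probs` matrix stands in for the ancestry probabilities, "keep every 10th marker" stands in for whatever thinning rule the script really uses, and `cor()` stands in for the actual off-diagonal LOD computation.

```r
# Hypothetical stand-in data: 20 individuals x 1000 markers.
set.seed(1)
n_ind <- 20
n_markers <- 1000
probs <- matrix(runif(n_ind * n_markers), nrow = n_ind, ncol = n_markers)

# Assumed thinning rule: keep every 10th marker.
thin_step <- 10
keep <- seq(1, n_markers, by = thin_step)
probs_thinned <- probs[, keep, drop = FALSE]

# A marker-by-marker matrix has N^2 cells, so count them for the full set
# without actually building the full matrix.
pairwise_full_cells <- n_markers^2

# cor() is only a placeholder for the real pairwise LOD calculation.
pairwise_thin <- cor(probs_thinned)
pairwise_thin_cells <- length(pairwise_thin)

cat("full cells:", pairwise_full_cells,
    "thinned cells:", pairwise_thin_cells, "\n")
```

Because the cost grows with the square of the marker count, thinning by a factor of k cuts the pairwise work by roughly k^2 in this toy setup.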

I think we should do step 5 and then step 2 on the thinned data. Peter, any thoughts?

I am rerunning the R script to see if I hit the same error.

D

On Jul 17, 2012, at 3:03 PM, Pinero, Gregory wrote:

Hi guys,

I investigated some more.

Basically the script has these steps:

1. First there's a section called "Full marker set, no thinning", which reads in the ancestry-probs-.all.rda files and writes out ancestry-probs-.tsv files.
2. Then there's a section for the off-diagonal LOD profile, where it reads in the ancestry-probs-.plot.rda files and creates the -offdiag.rda files, the offdiagonal_data.tsv file, and offidiagonal-lod.pdf.
3. Then there's a section called segregation proportions that creates segregation.pdf.
4. Then a section to create missing.pdf.
5. Finally it does the thinning (using the small thinned RDA files), creating the lod-matrix.bmp (this is where the error happened).

Steps 1 and 2 use the large files.

There isn't a built-in way to turn off any of the steps.

I tried rerunning this script with your data and I couldn't reproduce the error. Do you want to try another run and see if it happens again?

Here are a couple of options I see for fixing it if you hit it again:

- I could add some rm() statements before the thinning section to remove data no longer needed from memory, and then force a garbage collection with the gc() statement. (It might not help but it's easy to try.)
- We could add some switches to turn off some of the steps above if you don't think you need them all.

Let me know what you think.
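For reference, the rm() and gc() idea would look roughly like this. This is only a sketch: `full_probs` is a made-up name for a large object left over from the earlier steps, not a variable taken from summaryPlots.R.

```r
# Hypothetical large object standing in for data loaded by the earlier steps.
full_probs <- matrix(0, nrow = 2000, ncol = 2000)
size_before <- object.size(full_probs)

# Drop the reference once the earlier steps no longer need it ...
rm(full_probs)

# ... then ask R to run a garbage collection right away.
# gc() returns a small matrix summarising current memory use.
gc_report <- gc()

# The name is gone from the workspace after rm().
still_there <- exists("full_probs")
print(gc_report)
```

R would normally collect the freed memory on its own when it needs to, so the explicit gc() mainly makes the release happen before the thinning section starts. Whether that is enough to avoid the error is not something this sketch can show.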

-Greg

Here is the command your run called to launch this script. You could try running just that and see what happens, e.g.:

qlogin -l mem96=true,excl=true -now n
cd /groups/stern/home/sternd/V
Rscript msg/summaryPlots.R -c 2,3,4,X -p 2,3,4,X -d hmm_fit -t 1 -f .5 -b barcodes.txt -n .05