ghm17 / LOGODetect

LOGODetect is a powerful tool to identify small segments that harbor local genetic correlation between two traits/diseases.
GNU General Public License v3.0

Remove empty directories created by aggregation.R #2

Closed jdblischak closed 3 years ago

jdblischak commented 3 years ago

The script aggregation.R creates many subdirectories per chromosome in Data/ similar to how random_vector_generation.R creates many subdirectories per chromosome in Temp/. However, as far as I can tell, the directories created in Data/ are never used.

aggregation.R writes its output files directly to the chromosome directory:

https://github.com/ghm17/LOGODetect/blob/f6877934b1b89bd4e63ba2306ffbf11b36de3d44/Code/aggregation.R#L47

https://github.com/ghm17/LOGODetect/blob/f6877934b1b89bd4e63ba2306ffbf11b36de3d44/Code/aggregation.R#L60

And BiScan_null.R also ignores those empty subdirectories:

https://github.com/ghm17/LOGODetect/blob/f6877934b1b89bd4e63ba2306ffbf11b36de3d44/Code/BiScan_null.R#L118-L122

ghm17 commented 3 years ago

Thank you for the comments.

The two scripts random_vector_generation.R and aggregation.R create random vectors drawn from the normal distribution with mean zero and covariance matrix V or V^2 (here V denotes the LD matrix). The final outputs are saved per chromosome in the directory Data/ (e.g. Data/random_ld/chr1/s_1.txt, ..., Data/random_ld/chr1/s_10000.txt represent 10000 random vectors sampled from N(0, V^2)). The directory Temp/ holds the intermediate data, and the files under this directory are removed by the script aggregation.R.
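To illustrate what sampling from N(0, V^2) means, here is a minimal sketch in Python. It is not the code from random_vector_generation.R or aggregation.R. It relies only on the standard fact that if z ~ N(0, I) and V is symmetric, then Vz has covariance V V^T = V^2. The function names and the 2x2 matrix values are hypothetical and chosen only for illustration.

```python
import random


def sample_from_v_squared(V, rng):
    """Draw one vector from N(0, V^2) for a symmetric matrix V.

    If z ~ N(0, I), then V z has covariance V I V^T = V^2.
    """
    n = len(V)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(V[i][j] * z[j] for j in range(n)) for i in range(n)]


def empirical_cov(samples):
    """Empirical covariance of zero-mean samples (the mean is known to be 0)."""
    n = len(samples[0])
    m = len(samples)
    return [
        [sum(s[i] * s[j] for s in samples) / m for j in range(n)]
        for i in range(n)
    ]


# Toy symmetric "LD matrix" (hypothetical values, not taken from the repository).
V = [[1.0, 0.5],
     [0.5, 1.0]]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_from_v_squared(V, rng) for _ in range(20000)]

# The empirical covariance should be close to V^2 = [[1.25, 1.0], [1.0, 1.25]].
cov = empirical_cov(samples)
```

Sampling from N(0, V) instead would need a matrix square root of V (for example a Cholesky factor), which this sketch does not cover.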

As you can see, we create many random vectors, which occupy a considerable amount of storage space. In return, we can reduce memory usage in BiScan_null.R.

jdblischak commented 3 years ago

@ghm17 Thanks for the additional context and explanation. However, I don't believe you addressed the changes I made in this Pull Request. I think that the script aggregation.R is needlessly creating many empty directories that are never used (and not deleted afterwards). Do you disagree? If yes, could you please point me to the lines in your scripts where directories such as Data/random_ld/chr1/1/ are being used?

ghm17 commented 3 years ago

@jdblischak You are right. The subdirectories Data/random_ld/chr1/1/ are never used. Thank you very much for pointing this out.
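For anyone who already ran the older scripts and wants to clear the leftover directories, the following is a minimal sketch in Python, not part of LOGODetect. It assumes the layout mentioned in this thread (a chromosome directory such as Data/random_ld/chr1/ holding both output files and unused subdirectories). The function name `remove_empty_subdirs` is hypothetical. It removes only immediate subdirectories that are empty and leaves files and non-empty directories untouched.

```python
import os


def remove_empty_subdirs(chrom_dir):
    """Remove empty immediate subdirectories of chrom_dir.

    Returns the sorted names of the directories that were removed.
    Files and non-empty directories are left in place.
    """
    removed = []
    for entry in sorted(os.listdir(chrom_dir)):
        path = os.path.join(chrom_dir, entry)
        # os.rmdir only succeeds on empty directories, but checking first
        # avoids raising on directories that still hold data.
        if os.path.isdir(path) and not os.listdir(path):
            os.rmdir(path)
            removed.append(entry)
    return removed
```

A call such as `remove_empty_subdirs("Data/random_ld/chr1")` would then return the names of the subdirectories it deleted. The path is taken from the example in this thread and may differ in your setup.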