smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/

ConstructNetwork temp .rda file location #182

Open DelongZHOU opened 5 months ago

DelongZHOU commented 5 months ago

Describe the bug: I believe ConstructNetwork creates temporary .rda files in R's current working directory instead of saving them to the path provided for the TOM files. I was running the pipeline for 3 datasets on a cluster and found that each job was creating or overwriting a temp file with the same name. For now I'm creating a separate folder for each job, but it would be best to save the temp files to their respective output paths and add appropriate prefixes.

Thanks!

smorabit commented 5 months ago

Hi, I think that for now it is safest to run ConstructNetwork in separate directories, as you are doing, if you are running hdWGCNA simultaneously as you have described. Just so I know exactly which files you are talking about, what are the names of the temp files that are being created?

DelongZHOU commented 5 months ago

Hi, I can't remember exactly, but it's something like block.rda and consensus.rda. From what I remember of my error messages, the consensus file is eventually saved as the TOM file to the path specified in the function parameter.

rdalbanus commented 5 months ago

Hi Zhou Delong, I was doing some digging in this repo for a related question and stumbled upon your issue. ConsensusTOM-block.*.rda are hardcoded intermediary file names in the ConstructNetwork function, so there is no way to run multiple datasets in parallel in the same directory. You need to run them in separate dirs, like Dr. Morabito suggested. Cheers
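The separate-directory workaround discussed above could be scripted roughly as follows. This is a minimal base-R sketch: `run_in_own_dir` is a hypothetical helper, not part of hdWGCNA, and the ConstructNetwork call in the usage comment only assumes arguments that appear elsewhere in this thread.

```r
# Hypothetical helper (not part of hdWGCNA): run `fun` with the working
# directory temporarily set to `dir`, so that any files written relative to
# the cwd land in a job-specific folder.
run_in_own_dir <- function(dir, fun) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  old_wd <- setwd(dir)
  on.exit(setwd(old_wd), add = TRUE)  # restore the cwd even if `fun` errors
  fun()
}

# Usage sketch (assumes `seurat_obj` has already been prepared for hdWGCNA):
# seurat_obj <- run_in_own_dir(
#   file.path("wgcna_jobs", "dataset1"),
#   function() ConstructNetwork(seurat_obj, tom_name = "dataset1", overwrite_tom = TRUE)
# )
```

Because relative paths are resolved against the temporary working directory, any input paths used inside `fun` would need to be absolute.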

smorabit commented 5 months ago

Hi,

I updated ConstructNetwork to create the temp files in the tom_outdir and to append tom_name to the file names. So if you are running ConstructNetwork on a cluster you should simply specify these parameters to avoid the overwriting conflicts.

DelongZHOU commented 2 months ago

Hi Sam, I'm re-opening this issue. The temp TOM file is now named "individualTOM-Set1-Block1.RData", so my jobs are conflicting with each other again. Best, Delong

smorabit commented 2 months ago

Interesting, I did not make any changes to that function, so I am not sure why the behavior would change all of a sudden. I will look into it and see if I can replicate this on my end.

smorabit commented 2 months ago

Just tried it on my end, and the behavior has not changed for me. To clarify, if we run ConstructNetwork like this:


seurat_obj <- ConstructNetwork(
  seurat_obj,
  tom_outdir = 'TOM_test',
  tom_name = 'testing',
  overwrite_tom = TRUE
)

The temporary file name(s) should be {tom_name}_block.1.rda. So in this example it would be testing_block.1.rda.

Can you please share the code that you are using for ConstructNetwork?

DelongZHOU commented 2 months ago

My code is similar; the only difference is that I don't set tom_outdir:

seurat_obj <- ConstructNetwork(
  seurat_obj,
  setDatExpr = FALSE,
  minModuleSize = 100,
  tom_name = paste0('psy+ctrl.', celltype), # name of the topological overlap matrix written to disk
  overwrite_tom = TRUE
)

Edit: formatting

DelongZHOU commented 1 month ago

Hi Sam,

I've been monitoring the temp files, and I found that the TOM folder does indeed contain a temp file named <>_block.1.rda. In addition to that, however, the folder where the script is run contains another temp file named individualTOM-Set1-Block1.RData, and it is this file that causes the conflicts.

Does the code produce this .RData at your end as well?

Best, Delong
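One way to check a directory for stray temp files like the ones named in this thread is to list the files that match their naming patterns. This is a minimal base-R sketch: `find_stray_tom_files` is a hypothetical helper, not part of hdWGCNA or WGCNA, and the patterns are taken only from the file names reported above.

```r
# Hypothetical helper: return the names of files in `dir` that look like the
# temporary TOM files mentioned in this thread.
find_stray_tom_files <- function(dir = ".") {
  patterns <- c(
    "^individualTOM-Set[0-9]+-Block[0-9]+\\.RData$",  # e.g. individualTOM-Set1-Block1.RData
    "^ConsensusTOM-block\\..*\\.rda$",                # ConsensusTOM-block.*.rda
    "_block\\.[0-9]+\\.rda$"                          # e.g. testing_block.1.rda
  )
  files <- list.files(dir)
  files[grepl(paste(patterns, collapse = "|"), files)]
}

# Usage sketch: check the directory the script was launched from.
# find_stray_tom_files(getwd())
```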