It appears that during preprocessing, GRIDSS creates a PDF report with an R script. We find that this sometimes hangs silently for a very long time, and the entire step then fails. Is it possible to disable this reporting? I found that the problem arises with inputs that have many (>1M) lines. This metric reports very long insert sizes, so it may be practical to introduce a hard stop, for example after 5-10K.
Adding these lines after loading the histogram in picard/analysis/insertSizeHistogram.R also improves the situation without altering the plot significantly:
# Sub-sample the metric data to prevent long run times with large inputs.
histogram_rows <- sample(seq_len(nrow(histogram)), min(nrow(histogram), 10000), replace = FALSE)
histogram <- histogram[histogram_rows, ]
# Restore ascending insert_size order after sampling.
histogram <- histogram[order(histogram$insert_size), ]
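
The hard stop mentioned above could be an alternative to sub-sampling. The following is only a rough sketch of that idea, not something I have tested inside the GRIDSS pipeline: the cutoff value `max_insert_size`, the toy data frame, and the `count` column name are assumptions for illustration; only `histogram` and `insert_size` come from the script.

```r
# Hypothetical cutoff, taken from the 5-10K range suggested above.
max_insert_size <- 10000

# Toy stand-in for the data frame the script loads.
# The `count` column name is made up for this example.
histogram <- data.frame(
  insert_size = c(100, 250, 400, 9000, 250000, 3000000),
  count = c(50, 120, 80, 3, 1, 1)
)

# Keep only rows at or below the cutoff.
# drop = FALSE keeps the result a data frame even with a single column.
histogram <- histogram[histogram$insert_size <= max_insert_size, , drop = FALSE]
```

Unlike sub-sampling, this keeps every row in the plotted range and discards only the long tail, so the plot would no longer show the very long insert sizes at all.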