Closed slsevilla closed 1 year ago
I have a few questions/observations:
tr: write error:... what is tr? Does it mean that the error occurred while running the command tr, or is it simply an abbreviation for trace or something?
checkquota returns this:
% checkquota|grep -i "khan\|clin"
/data(Clinomics): 22.0 TB 31.0 TB 71.03% 602950 32000000 1.88%
/data(khanlab): 218.4 TB 221.0 TB 98.80% 22798400 32000000 71.25%
/data(khanlab2): 93.2 TB 117.0 TB 79.65% 3320539 31457280 10.56%
/data(khanlab3): 155.9 TB 200.0 TB 77.97% 4860086 32000000 15.19%
which suggests we have lots of space under khanlab2... so why Disk quota exceeded?
Talked with Xinyu this morning and it was a memory issue (perhaps he had deleted files in between the errors and you running checkquota). He has a list of the projects affected and will delete these runs and restart.
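For future checks, a small filter over the checkquota output can make near-full volumes stand out. In the listing above, khanlab2 is at 79.65% of its space quota while khanlab is at 98.80%. The sketch below is only an illustration: flag_near_quota is a hypothetical helper, and it assumes the column layout shown above (volume, used, quota, space %, files, file limit, files %), which may not match every checkquota version.

```shell
# Hypothetical helper: flag_near_quota THRESHOLD
# Reads checkquota-style lines on stdin and prints every volume whose space
# or file-count usage is at or above THRESHOLD percent.
# Assumed layout (taken from the output pasted in this thread):
#   /data(name): <used> <unit> <quota> <unit> <space%> <files> <limit> <files%>
flag_near_quota() {
  awk -v t="$1" '
    {
      space = $6; files = $9
      sub(/%/, "", space)   # strip the trailing percent sign
      sub(/%/, "", files)
      if (space + 0 >= t || files + 0 >= t)
        printf "%s space=%s%% files=%s%%\n", $1, space, files
    }'
}
```

Possible usage: checkquota | grep -i "khan\|clin" | flag_near_quota 90. This reports only what checkquota says at the moment it runs, so usage that changed between the errors and the check would not show up.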
Problem: Multiple errors were seen across a significant number of samples in the latest sample run.
Example errors from log /data/khanlab2/processed_DATA/ngs_pipeline_SJ031111=SJEPD031111_D1=20220911_20221116_151637.log:
Review of one error log, log/FUSION_CATCHER.52732657.e:
Error message:
Solution: It appears that the errors related to this project are due to disk space issues. Because we are attempting to move analysis to a new location (related to the problem with Biowulf, #12), this is a larger concern. We are not using scratch space effectively and are keeping intermediate files that downstream analysis does not use, which leaves a large pipeline footprint per sample. We will need to determine a course of action to handle the reprocessing of samples plus new samples coming through the pipeline more effectively.
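As a starting point for deciding which intermediate files to drop, it may help to see which file types dominate a sample's footprint. The following is a rough sketch, not part of the pipeline: footprint_by_ext is a hypothetical helper, it assumes GNU find (for -printf), and it assumes file names contain no tabs or newlines.

```shell
# Hypothetical helper: footprint_by_ext DIR
# Sums file sizes (in bytes) under DIR grouped by file extension and prints
# "bytes<TAB>extension", largest first.
# Assumes GNU find, since -printf is not available in every find.
footprint_by_ext() {
  find "$1" -type f -printf '%s\t%f\n' |
    awk -F '\t' '
      {
        n = split($2, parts, ".")
        ext = (n > 1) ? parts[n] : "(none)"   # text after the last dot
        bytes[ext] += $1
      }
      END { for (e in bytes) printf "%.0f\t%s\n", bytes[e], e }' |
    sort -rn
}
```

Running footprint_by_ext on one sample's output directory gives a per-extension total that can be compared against what downstream steps actually read. Which extensions are safe to delete is a pipeline decision this sketch does not answer.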