MStarmans91/WORC (Workflow for Optimal Radiomics Classification)

A dot ('.') is added to a process yaml filepath #85

Status: Open. pianoza opened this issue 9 months ago.

pianoza commented 9 months ago

Describe the bug
When running the BasicWORC tutorial, it runs without any issue using the 'predict/CalcFeatures:1.0' configuration for FeatureCalculators. However, when using 'pyradiomics/Pyradiomics:1.0', the features are not calculated. I ran a fastr trace like this:

fastr trace $RUNDIR/WORC_Example_STWStrategyHN_BasicWORC/tmp/__sink_data__.json --sinks features_train_CT_0_pyradiomics --sample HN1006_0 -v

The output is this:

fastr.exceptions.FastrFileNotFound: FastrFileNotFound from $PYTHONENVPATH/envs/worc/lib/python3.7/site-packages/fastr/abc/serializable.py line 152: Could not find file Could not open $RUNDIR/WORC_Example_STWStrategyHN_BasicWORC/tmp/.calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0/HN1006_0/__fastr_result__.yaml for reading
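For context, the switch between the two feature calculators was made through the WORC config override mechanism, roughly as in the sketch below. The `add_config_overrides` call and the 'General'/'FeatureCalculators' section and key names follow the WORC tutorials and default config, so treat them as assumptions and check them against your WORC version.

```python
# Minimal sketch of switching the feature calculator in a BasicWORC
# experiment. The import path, add_config_overrides(), and the
# 'General'/'FeatureCalculators' keys are assumptions based on the
# WORC tutorials and default config; verify against your installed version.
from WORC import BasicWORC

experiment = BasicWORC('WORC_Example_STWStrategyHN_BasicWORC')

overrides = {
    'General': {
        # 'predict/CalcFeatures:1.0' runs fine; switching to the
        # PyRadiomics tool triggers the missing-feature issue above.
        'FeatureCalculators': '[pyradiomics/Pyradiomics:1.0]',
    },
}
experiment.add_config_overrides(overrides)
```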

In the tmp folder, the path exists under the name "calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0", but for some reason fastr is looking for a hidden folder ".calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0" that does not exist.
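The mismatch can be reproduced with a few lines of plain Python from the experiment folder (a minimal sketch; $RUNDIR stands for the experiment root, as in the trace output above):

```python
# Quick check: the non-hidden node folder exists, but the dotted name
# fastr asks for does not. Replace '$RUNDIR' with the actual path.
from pathlib import Path

tmp = Path('$RUNDIR') / 'WORC_Example_STWStrategyHN_BasicWORC' / 'tmp'
name = 'calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0'

print((tmp / name).is_dir())          # True: the folder is there
print((tmp / ('.' + name)).is_dir())  # False: the hidden variant fastr wants
```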

WORC configuration
Default config with overrides on "FeatureCalculators" to use pyradiomics. In WORC_config.py, the mount folders for tmp and output were also changed to point inside $RUNDIR/experimentfolder.
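For reference, the mount changes in WORC_config.py looked roughly like the sketch below. That fastr picks up a `mounts` dictionary from config files (typically under ~/.fastr/config.d/) matches the fastr configuration mechanism, but the exact mount names and paths here are only illustrative assumptions:

```python
# Sketch of the mount overrides in ~/.fastr/config.d/WORC_config.py.
# fastr injects `mounts` when it executes this config file; the guard
# below only keeps the snippet importable outside fastr. The mount
# names 'tmp' and 'output' follow the description above and may differ
# in your installation.
import os

try:
    mounts  # provided by fastr's config loader
except NameError:
    mounts = {}

RUNDIR = os.path.expanduser('~/experiments')  # hypothetical experiment root

mounts['tmp'] = os.path.join(RUNDIR, 'experimentfolder', 'tmp')
mounts['output'] = os.path.join(RUNDIR, 'experimentfolder', 'output')
```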

Update
It seems the problem is not only related to pyradiomics. For some reason, sometimes fastr keeps looking for a hidden folder. For example, for all the failed sinks (Barchart_PNG, classification, StatisticalTestFeatures_CSV, etc.), fastr is looking for a hidden folder .classify instead of the existing folder classify:

fastr trace $RUNDIR/tmp/__sink_data__.json --sinks Barchart_PNG -v --sample all

[WARNING] __init__:0084 >> Not running in a production installation (branch "unknown" from installed package)
Tracing errors for sample all from sink Barchart_PNG
Located result pickle: $RUNDIR/tmp/.classify/all/__fastr_result__.yaml
Traceback (most recent call last):
  File "$HOME/miniconda3/envs/worc/lib/python3.7/site-packages/fastr/abc/serializable.py", line 142, in loadf
    with open_func(path, 'rb') as fin:
FileNotFoundError: [Errno 2] No such file or directory: '$RUNDIR/tmp/.classify/all/__fastr_result__.yaml'
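To check all of the failed sinks in one go, the same fastr trace command can be looped over from Python and the "Located result pickle:" line inspected for the stray dot. A rough sketch, with $RUNDIR again a placeholder and the sink list abbreviated:

```python
# Run `fastr trace` for several failed sinks and print the result
# pickle path that fastr reports, to see which sinks resolve to a
# dotted (hidden) node folder. $RUNDIR is a placeholder; extend
# FAILED_SINKS with the other failing sinks as needed.
import subprocess

RUNDIR = '/path/to/RUNDIR'  # placeholder
SINK_DATA = f'{RUNDIR}/tmp/__sink_data__.json'
FAILED_SINKS = ['Barchart_PNG', 'classification', 'StatisticalTestFeatures_CSV']

for sink in FAILED_SINKS:
    proc = subprocess.run(
        ['fastr', 'trace', SINK_DATA, '--sinks', sink, '-v', '--sample', 'all'],
        capture_output=True, text=True,
    )
    # The trace output may go to stdout or stderr depending on logging setup.
    for line in (proc.stdout + proc.stderr).splitlines():
        if 'Located result pickle:' in line:
            path = line.split(':', 1)[1].strip()
            print(sink, '->', path)
```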

hachterberg commented 9 months ago

Hi @pianoza,

could you give the entire traceback? I am trying to see where this can go wrong. It seems the node directories are starting with a dot, but I cannot think of why that would be. I assume the fastr nodes don't have a dot in their label? I need to see from the complete traceback which call actually requests the loading of the file, because the dot is probably already introduced there somehow.

Kind regards, Hakim

pianoza commented 8 months ago

Hi @hachterberg, thank you for the reply and apologies for the delayed response. Unfortunately, I had deleted all the temp files from that run, but I will try to reproduce it again soon.

However, I have done some more tests using PREDICT as the feature extractor, which runs OK. When running WORC with a small number of samples, everything goes well, but it crashes when running with 1493 samples. Here is the full trace of the run with 1493 samples:

 [WARNING]  __init__:0084 >> Not running in a production installation (branch "unknown" from installed package)
Barchart_PNG -- 1 failed -- 0 succeeded
Barchart_Tex -- 1 failed -- 0 succeeded
BoxplotsFeatures_Zip -- 1 failed -- 0 succeeded
Decomposition_PNG -- 1 failed -- 0 succeeded
Hyperparameters_CSV -- 1 failed -- 0 succeeded
PRC_CSV -- 1 failed -- 0 succeeded
PRC_PNG -- 1 failed -- 0 succeeded
PRC_Tex -- 1 failed -- 0 succeeded
ROC_CSV -- 1 failed -- 0 succeeded
ROC_PNG -- 1 failed -- 0 succeeded
ROC_Tex -- 1 failed -- 0 succeeded
RankedPercentages_CSV -- 1 failed -- 0 succeeded
RankedPercentages_Zip -- 1 failed -- 0 succeeded
RankedPosteriors_CSV -- 1 failed -- 0 succeeded
RankedPosteriors_Zip -- 1 failed -- 0 succeeded
StatisticalTestFeatures_CSV -- 1 failed -- 0 succeeded
StatisticalTestFeatures_PNG -- 1 failed -- 0 succeeded
StatisticalTestFeatures_Tex -- 1 failed -- 0 succeeded
classification -- 1 failed -- 0 succeeded
config_MRI_0_sink -- 0 failed -- 1 succeeded
config_classification_sink -- 0 failed -- 1 succeeded
features_train_MRI_0_predict -- 0 failed -- 1493 succeeded
performance -- 1 failed -- 0 succeeded
segmentations_out_segmentix_train_MRI_0 -- 0 failed -- 1493 succeeded

 [WARNING]  __init__:0084 >> Not running in a production installation (branch "unknown" from installed package)
Tracing errors for sample all from sink Barchart_PNG
Located result pickle: /home/kaisar/Research/Coding/PathologyResponsePrediction/BreastMRI/WORC/BreastPCR/tmp/PCR_trial7_all_nnunet_101/classify/all/__fastr_result__.yaml

===== JOB WORC_PCR_trial7_all_nnunet_101___classify___all =====
Network: WORC_PCR_trial7_all_nnunet_101
Run: WORC_PCR_trial7_all_nnunet_101_2024-01-10T13-05-45
Node: classify
Sample index: (0)
Sample id: all
Status: JobState.execution_failed
Timestamp: 2024-01-10 13:29:30.868955
Job file: /tmp/PCR_trial7_all_nnunet_101/classify/all/__fastr_result__.yaml

----- ERRORS -----
- FastrSubprocessNotFinished: There is no information that the subprocess finished properly: appears the job crashed before the subprocess registered as finished. (/home/kaisar/miniconda3/envs/worc/lib/python3.7/site-packages/fastr/execution/executionscript.py:109)
------------------

No process information:
Cannot find process information in Job information, processing probably got killed.
If there are no other errors, this is often a result of too high memory use or
exceeding some other type of resource limit.

Output data:
{}

Status history:
2024-01-10 13:29:30.868971: JobState.created
2024-01-10 13:29:31.515629: JobState.hold
2024-01-10 16:25:21.806164: JobState.queued
2024-01-10 16:34:49.243102: JobState.running
2024-01-10 16:35:04.239567: JobState.execution_failed

It seems like it is hitting a memory limit ("If there are no other errors, this is often a result of too high memory use or exceeding some other type of resource limit"). I am running WORC on a machine with 128 GB of RAM.
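If memory is indeed the suspect, a rough watcher like the sketch below (it assumes psutil is installed, and the 'classify' match string is just a guess at what appears on the job's command line) can record the peak RSS of the classification subprocess while the experiment runs, to confirm whether it approaches the 128 GB limit:

```python
# Rough diagnostic: poll all processes whose command line mentions
# 'classify' and track their peak resident memory. Assumes psutil is
# installed; the match string is an assumption and purely illustrative.
import time

import psutil

def watch(pattern='classify', interval=5.0):
    peak = {}  # pid -> peak RSS in bytes
    while True:
        for proc in psutil.process_iter(['pid', 'cmdline']):
            cmdline = ' '.join(proc.info['cmdline'] or [])
            if pattern not in cmdline:
                continue
            try:
                rss = proc.memory_info().rss
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                continue
            pid = proc.info['pid']
            peak[pid] = max(peak.get(pid, 0), rss)
        for pid, rss in sorted(peak.items()):
            print(f'pid {pid}: peak RSS {rss / 1024**3:.1f} GiB')
        time.sleep(interval)

if __name__ == '__main__':
    watch()
```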

Maybe it's related to this issue as well: https://github.com/MStarmans91/WORC/issues/79#issue-1733400283

Best regards, Kaisar