Note: as an experiment, I deployed a new GitHub Action that will run R check on PLP every weekend (see fe2e747017783ad450e4f95bf835c826106b2f54). Right now, the workflow is just a vanilla R check script; I didn't include Python. Feel free to add it if you think that is necessary.
Weird: the R check on PLP main itself just passed: https://github.com/OHDSI/PatientLevelPrediction/actions/runs/3747241395
I'll run the HADES-wide check again.
PLP has a check when converting the plpData into a sparse matrix to ensure you have enough RAM. If it estimates you have insufficient RAM available, it stops with that error. So the machine running the tests must be overloaded.
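For reference, a minimal sketch of what such a pre-check might look like; this is not the actual `checkRam` implementation, and `getFreeRamBytes()` is a hypothetical helper standing in for a system-memory query (PLP uses the memuse package for this):

```r
# A minimal sketch of the kind of pre-check described above, NOT the actual
# PatientLevelPrediction:::checkRam implementation. getFreeRamBytes() is a
# hypothetical stand-in for a system-memory query.
checkRamSketch <- function(covariateData, safetyFactor = 0.9) {
  # Rough size estimate for a sparse matrix: about 16 bytes per non-zero
  # entry (8-byte double value + 4-byte integer row index, plus pointers).
  nNonZero <- covariateData$covariates |>
    dplyr::count() |>
    dplyr::pull()
  estimatedBytes <- nNonZero * 16
  if (estimatedBytes > safetyFactor * getFreeRamBytes()) {
    stop("Insufficient RAM")
  }
  invisible(TRUE)
}
```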
Never mind. It is passing now...
This test seems to be failing intermittently. Is there maybe a way to make sure it always passes?
```
── Error ('test-featureEngineering.R:146'): randomForestFeatureSelection ───────
Error in `checkRam(newcovariateData, 0.9)`: Insufficient RAM
Backtrace:
    ▆
 1. └─PatientLevelPrediction:::randomForestFeatureSelection(...) at test-featureEngineering.R:146:2
 2.   └─PatientLevelPrediction::toSparseM(trainData)
 3.     └─PatientLevelPrediction:::checkRam(newcovariateData, 0.9)

── Error ('test-featureEngineering.R:161'): featureSelection is applied on test_data ──
Error in `checkRam(newcovariateData, 0.9)`: Insufficient RAM
Backtrace:
    ▆
 1. └─PatientLevelPrediction:::univariateFeatureSelection(...) at test-featureEngineering.R:161:2
 2.   └─PatientLevelPrediction::toSparseM(trainData, trainData$labels)
 3.     └─PatientLevelPrediction:::checkRam(newcovariateData, 0.9)
```
As you can see, it passed two weeks in a row, but on the 3rd week it failed: https://github.com/OHDSI/PatientLevelPrediction/actions/runs/3862053323/jobs/6583400031
Personally I would be in favor of removing the `checkRam` check. Just because "free" memory is low doesn't mean the memory is not available if a process requested it. See for example this Stack Overflow answer about macOS in particular. I'll quote the relevant bit:
> So in short, you actually shouldn't care if your free memory is low. In fact, you want it to be low -- free memory is wasted memory (as the OS isn't using it for anything).
So I think the `checkRam` check can give an error even if you wouldn't run into out-of-memory issues trying to load the data into memory. I'm pretty sure this is the case for the above failure. According to the info I can find, the macOS runners should have 14 GB of memory. The simulated data used for the tests should never come close to that amount; the covariates table should be about 50 MB.
So my suggestion is to not check the RAM before loading the sparse data into memory: try it and let it fail if it's too big. Depending on the failure, we can give an informative error message using a tryCatch. This assumes it won't crash the whole R session if it fails, but that should be easy to test beforehand.
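A sketch of what that could look like, assuming an allocation failure surfaces as a catchable error rather than crashing the R session (the message matching and wording here are illustrative):

```r
# Sketch of the suggestion: no pre-check, just attempt the conversion and
# translate an allocation failure into an informative error. Assumes the
# failure raises a catchable condition ("cannot allocate ..." is R's
# standard allocation error message) rather than crashing the session.
sparseData <- tryCatch(
  PatientLevelPrediction::toSparseM(trainData),
  error = function(e) {
    if (grepl("cannot allocate", conditionMessage(e))) {
      stop("Not enough RAM to convert plpData to a sparse matrix. ",
           "Free up memory or use a machine with more RAM.",
           call. = FALSE)
    }
    stop(e)
  }
)
```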
I think the original motivation was that I would run something and it would take hours before PLP failed due to RAM issues, so I added the check to fail quicker, but it is overly sensitive.
I realized recently that I think FE extracts all the data into RAM when it downloads the covariates, so in theory, if a computer does not have sufficient RAM, it would fail when extracting the data. If the data gets downloaded, then I think the computer is good enough, but the user could have multiple R sessions (or other things) taking up the RAM. Maybe we could just have a warning about the RAM (then the user could cancel if they wanted) rather than stopping?
If we remove the RAM check completely, we can get rid of a dependency, which is good.
FE does not extract all data into RAM. It downloads directly to an Andromeda object, keeping only small chunks in RAM at a time.
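For reference, the Andromeda pattern being described, in generic form (an illustration of chunked processing, not FeatureExtraction's actual download code):

```r
library(Andromeda)

# Generic illustration of chunk-wise processing with Andromeda, not
# FeatureExtraction's actual code: batchApply() streams the table through
# a function in batches, so only one batch is in RAM at a time.
andr <- andromeda(covariates = data.frame(rowId = 1:100000,
                                          covariateId = 1L,
                                          covariateValue = 1.0))
total <- 0
batchApply(andr$covariates, function(batch) {
  total <<- total + sum(batch$covariateValue)
}, batchSize = 10000)
close(andr)
```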
That's what I thought, but my R session crashed when I extracted covariates from a very large cohort. I was warned about the issue by others as well.
Please document that issue at the appropriate package repo, ideally with a reproducible example.
@jreps I think maybe we should start by changing the `checkRam` error to a warning so it doesn't break the tests. I would still prefer to remove it. I think the issue you mentioned, failing after many hours, might not be relevant anymore, since most things are now in tryCatch blocks, so if one thing fails it moves on to the next. Except if the R session is crashing, I guess. But if you feel it's still useful to have the check, I'd be fine with a warning instead of an error.
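Concretely, the change could be as small as this sketch (the variable names are hypothetical stand-ins for whatever estimates `checkRam` already computes):

```r
# Sketch of turning the stop() into a warning(), so the user can cancel
# the run if they want. estimatedGb and freeGb are hypothetical
# placeholders for the estimates checkRam already computes.
warnRamSketch <- function(estimatedGb, freeGb, safetyFactor = 0.9) {
  if (estimatedGb > safetyFactor * freeGb) {
    warning("Estimated RAM required (", round(estimatedGb, 1), " GB) exceeds ",
            round(safetyFactor * 100), "% of reported free RAM (",
            round(freeGb, 1), " GB). Note that reported free RAM may ",
            "undercount inactive/cached memory, especially on macOS.",
            call. = FALSE)
  }
  invisible(NULL)
}
```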
I still think the `memuse` package might not be counting "inactive/cached/buffered" memory correctly on macOS, but I don't have a Mac to test on.
I just tested it on my Mac and it seems it may indeed be incorrect. Shall we remove the `checkRam` check and just have a warning printing the estimated amount of RAM required?
Sounds good to me
I'm not sure why I haven't seen this before, but now I get this error message:
Any idea what is going on?