Open cchwala opened 1 year ago
This is a bit strange; the new transformer code should not use that much more memory. In the worst case, if we do not find a way to solve this, we could use the old transformation code, which is now working, to restructure the data. As mentioned in #62, it could just be the binder environment. One thing to check is whether both transform_andersson_2022_OpenMRG and transform_andersson_2022_OpenMRG_linkbylink crash in the same environment.
Since it comes from the extraction process it looks to me like a disk space issue, which in the case of a binder pod is most likely a disk quota issue.
I just ran it again after your latest commit in #62 and it worked without this error... About an hour ago I was on a small binder pod from OVH with 2 GB of RAM. Maybe the available disk space depends on how many other pods run on the same server, or on how much disk space the other users have already used up...
Maybe this problem will not occur often enough to work on a solution. If so, we could just add an info text somewhere.
The resources of the binder pods are limited. Running the data exploration notebook with the current version from #62, I get the following error message:
I am not sure if this is due to changes in the function transform_andersson_2022_OpenMRG, maybe introduced in #62, but since the error comes from zfile.extractall(path_to_extract_to) it seems to just stem from the fact that we exceed a disk quota set on the binder pod. I do not know how to find out how large that quota is.
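
One way to narrow this down would be to compare the uncompressed size of the archive with the free space reported for the extraction directory before calling extractall. The following is only a minimal sketch using the standard library. The helper names uncompressed_size and has_room_for_extraction are hypothetical and not part of the existing code, and the free space reported by shutil.disk_usage is that of the underlying filesystem, so it may not reflect a per-pod quota enforced by binder.

```python
import shutil
import zipfile


def uncompressed_size(zip_path):
    """Return the total size in bytes of all archive members once extracted."""
    with zipfile.ZipFile(zip_path) as zfile:
        return sum(info.file_size for info in zfile.infolist())


def has_room_for_extraction(zip_path, path_to_extract_to):
    """Compare the extracted size with the free space reported for the target.

    Hypothetical helper: returns (fits, needed_bytes, free_bytes).
    Assumption: the reported free space may overestimate what a binder pod
    is actually allowed to write if a separate quota is in place.
    """
    needed = uncompressed_size(zip_path)
    free = shutil.disk_usage(path_to_extract_to).free
    return needed <= free, needed, free
```

Printing needed and free just before the extraction step in the notebook would at least show whether the archive is close to the reported limit when the error occurs, and the same check could drive an info text for users.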