conda-forge / conda-forge-ci-setup-feedstock

A conda-smithy repository for conda-forge-ci-setup.
BSD 3-Clause "New" or "Revised" License
13 stars 53 forks source link

Fix pagefile sizing #158

Closed h-vetinari closed 3 years ago

h-vetinari commented 3 years ago

As mentioned in #157, something is still not working with resetting the pagefile size - or the 4GB are not enough for some reason.

In any case, the errors with OSError: [WinError 1455] The paging file is too small for this operation to complete are still occurring in the numpy feedstock (e.g. this run for https://github.com/conda-forge/numpy-feedstock/pull/238), so add some echos to see what's happening.

CC @isuruf @mattip

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

h-vetinari commented 3 years ago

Ping @isuruf, when you have a moment. :)

h-vetinari commented 3 years ago

@isuruf, do you have any objections to merging this? We're pretty stuck with failures in the windows ci setup on the scipy-feedstock.

beckermr commented 3 years ago

I don't care either way on this pr, but you should be able to copy this package in its entirety into the feedstock and smithy can then be configured to use it. You can then make any changes you need there for debugging. This feedstock uses that feature for testing.

I'm happy to merge this too.

isuruf commented 3 years ago

You can already see the problem here.

'SetPageFileSize.ps1' is not recognized as an internal or external command, operable program or batch file.

call SetPageFileSize.ps1 is not the correct way to call a powershell script. See https://blog.danskingdom.com/allow-others-to-run-your-powershell-scripts-from-a-batch-file-they-will-love-you-for-it/

h-vetinari commented 3 years ago

You can already see the problem here.

Thanks for the tip on this - added a commit to try fixing it. It's still a separate issue from what's affecting the scipy feedstock (which continued working after #157 but now broke for other reasons, apparently).

h-vetinari commented 3 years ago

PTAL @isuruf

isuruf commented 3 years ago

This still fails in azure.

which continued working after #157

I don't think it worked given that the if condition was never triggered.

h-vetinari commented 3 years ago

This still fails in azure.

Where can I inspect that failure in the logs? I checked before the ping already and cannot find something suspicious.

which continued working after #157

I don't think it worked given that the if condition was never triggered.

I'm not saying the pagefile resize worked, I meant that the whole run_conda_forge_build_setup_win.bat broke some time well after #157. It's a separate issue that I'd like to hunt down by having better logging (which was the point of this PR, but I'm happy to fix the deficient pagefile resize en passant).

isuruf commented 3 years ago

In https://dev.azure.com/conda-forge/84710dde-1620-425b-80d0-4cf5baca359d/_apis/build/builds/357385/logs/181

'ExecuteSetPageFileSize.bat' is not recognized as an internal or external command, operable program or batch file.

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict. Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

h-vetinari commented 3 years ago

Thanks for the pointer, I was looking in the wrong section.

This still fails in azure.

Isn't this a chicken & egg problem? The CI for this PR is being set up with an old version of conda-forge-ci-setup; how can you expect a file to be there that is added to the package only a few steps later?

isuruf commented 3 years ago

Isn't this a chicken & egg problem? The CI for this PR is being set up with an old version of conda-forge-ci-setup; how can you expect a file to be there that is added to the package only a few steps later?

recipe/run_conda_forge_build_setup.bat and recipe/ExecuteSetPageFileSize.bat are in the same directory in the recipe and when installed, so it's just a matter of calling ExecuteSetPageFileSize.bat with the directory of the script run_conda_forge_build_setup.bat

h-vetinari commented 3 years ago

so it's just a matter of calling ExecuteSetPageFileSize.bat with the directory of the script run_conda_forge_build_setup.bat

Thanks, I added that and it's working now:

Calling: SetPageFileSize.ps1 -MinimumSize 8GB -MaximumSize 8GB -DiskRoot C:
[...]
(base) D:\a\1\s>PowerShell -NoProfile -ExecutionPolicy Bypass -Command "& 'D:\a\1\s\recipe\SetPageFileSize.ps1' -MinimumSize 8GB -MaximumSize 8GB -DiskRoot C:" 
PageFile: 8589934592 / 8589934592 bytes for \Device\HarddiskVolume2\pagefile.sys
isuruf commented 3 years ago

There's still an error,

Exception calling "SetPageFileSize" with "3" argument(s): "The operation completed successfully"
At D:\a\1\s\recipe\SetPageFileSize.ps1:196 char:1
+ [Util.PageFile]::SetPageFileSize($minimumSize, $maximumSize, $diskRoo ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : Win32Exception
h-vetinari commented 3 years ago

There's still an error,

I looked again at the reference - the drive letter needs quotes, which clashes with the powershell invocation that also has quotes. Hopefully the quote escapes work as intended 🤞

mattip commented 3 years ago

At what point does it become prudent to try some other way to do this? Maybe via a python script, or compiling an executable

h-vetinari commented 3 years ago

At what point does it become prudent to try some other way to do this?

Haha, you're absolutely right. But right now I only have a hammer, and am trying all the ways I can dream up of hitting that nail.

If you have a workable solution, please try it with a PR. Though I'm getting closer (haven't documented all my findings yet) - now it's just about finding the right number of quotes 😆

Parameters:
8589934592
8589934592
"C:  # <- note missing closing quote
mattip commented 3 years ago

~I have never seen a need for~ Rather than changing the pagefile size in CI, ~rather~ could tests/compilation ~should~ be adjusted to fit in available memory. Maybe we should go back to basics and ask what feedstocks are causing this to happen.

Edit: rephrase, just because I haven;t seen it doesn't mean it doesn't exist.

mattip commented 3 years ago

As for the scipy-feedstock, isn't there any other way to figure out what is going wrong?

h-vetinari commented 3 years ago

I mean, there was a version before that was working intermittently - sometimes it worked, sometimes it didn't. I foolishly thought it had something to do with the quoting, but it looks like all that is unnecessary, and the problem why it doesn't work occasionally is somewhere else.

It would be possible to merge this PR, it's still a strict improvement over the status quo (working half the time is better than working never), but that's up to @isuruf et al.

Otherwise, a separate PR to add the echos, or trying the work-around mentioned by @beckermr on the scipy-feedstock (though I don't know how).

h-vetinari commented 3 years ago

OK, this is now in the state I described above - sometimes it works -

CONDA_BLD_PATH=D:\\bld\\; Setting pagefile size to 8GB on C:
Parameters:
8589934592
8589934592
C:
PageFile: 8589934592 / 8589934592 bytes for \Device\HarddiskVolume2\pagefile.sys

and sometimes it doesn't:

CONDA_BLD_PATH=D:\\bld\\; Setting pagefile size to 8GB on C:
Parameters:
8589934592
8589934592
C:
Exception calling "SetPageFileSize" with "3" argument(s): "The operation completed successfully"
At D:\a\1\s\recipe\SetPageFileSize.ps1:197 char:1
+ [Util.PageFile]::SetPageFileSize($minimumSize, $maximumSize, $diskRoo ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : Win32Exception
h-vetinari commented 3 years ago

I'm out of ideas of how to deal with this and would just like to merge this and iterate later where necessary. cc @isuruf

mattip commented 3 years ago

The current version seems to be hanging in the windows run of scipy-feedstock. I added echo on and this is the result

mattip commented 3 years ago

Doesn't the use of the pagefile indicate a problem with the test suite? Personally I would be happier skipping a test or two in numpy instead of using the pagefile.

h-vetinari commented 3 years ago

The current version seems to be hanging in the windows run of scipy-feedstock. I added echo on and this is the result

Mea culpa, it seems #157 broke scipy after all. I hadn't considered that realistic because I hadn't seen other feedstocks fail similarly, but it seems that because scipy sets CONDA_BLD_PATH to C: specifically, it is affected by the if-condition of #157 (whereas other feedstocks are not). You could try deleting that setting (though this comes from the gfortran integration PR, so probably Isuru had good reason for https://github.com/conda-forge/scipy-feedstock/commit/2c5ea1f1753b298e1a0df109cff5c6096526e19a).

h-vetinari commented 3 years ago

Doesn't the use of the pagefile indicate a problem with the test suite? Personally I would be happier skipping a test or two in numpy instead of using the pagefile.

From the investigation in #155:

Stumbling over this discussion (regarding GH Actions), someone noted:

I’ve investigated this issue and the problem is that Windows decides to create a single page file on the D:\ drive, which has only 14 GB, and by default it allocates only 1/8th of the volume size, or about 1.75 GB. This is a very small page file by modern standards and many applications will fail.

I still think it would be good to set the page file size to be uniform. Intermittent failures are very annoying (and skipping tests is not a great remedy IMO - even just bisecting down to the exact set of problematic tests might be a substantial undertaking).

mattip commented 3 years ago

I am late to the discussion, so don't want to rehash all the good work that has been done so far. Is there a way to make the CI step to set the page file optional, document it, and have it off by default? Then at least most recipes would not need to see it.

mattip commented 3 years ago

Another idea: postpone this until conda-forge moves windows builds to github workflows, since then we could directly use this github action to set the pagesize?

h-vetinari commented 3 years ago

Is there a way to make the CI step to set the page file optional, document it, and have it off by default? Then at least most recipes would not need to see it.

I like this idea, but not sure how I'd have to go about that. Though presumably it would be possible in a similar way like CONDA_BLD_PATH can be set optionally.

Another idea: postpone this until conda-forge moves windows builds to github workflows, since then we could directly use this github action to set the pagesize?

I have no idea when that is going to materialise (actually, had no idea this is planned apparently), but I'd be surprised if anything happens in the next months - in that case I'd like to get it in earlier rather than later...

mattip commented 3 years ago

You could wrap the whole clause, from line 25 down, in an if "%SET_PAGEFILE%" NEQ "" condition

xhochy commented 3 years ago

Another idea: postpone this until conda-forge moves windows builds to github workflows, since then we could directly use this github action to set the pagesize?

I have no idea when that is going to materialise (actually, had no idea this is planned apparently), but I'd be surprised if anything happens in the next months - in that case I'd like to get it in earlier rather than later...

We will stick with Azure Pipelines, no plan for migrating to Github Actions. We don't have enough resources to build anything with Github Actions.

h-vetinari commented 3 years ago

You could wrap the whole clause, from line 25 down, in an if "%SET_PAGEFILE%" NEQ "" condition

I like this idea - WDYT @isuruf?

h-vetinari commented 3 years ago

I added a commit for the guard now; I can also open a PR that just removes the pagefile stuff completely (to unblock scipy), but I still think it's worth keeping for the problems on the numpy feedstock, and this PR would be ready.

mattip commented 3 years ago

LGTM, what else is needed to merge this?

h-vetinari commented 3 years ago

@isuruf, could you please give a go/no-go here?

h-vetinari commented 3 years ago

Thanks for your patience on this one, everyone, and apologies again for getting it wrong the first time. Fingers crossed this will now work better (I might later raise an issue on https://github.com/al-cheb/configure-pagefile-action - where the powershell script is from - to see what they think about the Exception calling "SetPageFileSize" with "3" argument(s): "The operation completed successfully")