actions / setup-java

Set up your GitHub Actions workflow with a specific version of Java
MIT License

Fallbacks when jdk cannot be downloaded #333

Closed ben-manes closed 2 years ago

ben-manes commented 2 years ago

Description: Sometimes the JDK cannot be retrieved. In this execution, Cloudflare responded with a "Please Wait..." HTML payload, which the action did not understand.

It would be nice if (a) a retry approach were implemented and (b) fallback distributions could be set. If the execution allows for any compatible OpenJDK distribution, then this failure could be mitigated.

Justification: Execution failures due to external systems, where a retry and fallback policy could allow the action to recover and not cause noise.

e-korolevskii commented 2 years ago

Hello @ben-manes, Thank you very much for your request! We are considering your feature request and will inform you about it. Thank you for your patience.

ben-manes commented 2 years ago

I switched to temurin to take advantage of the tool-cache for jdk11 and, for jdk18, the fact that it is hosted on GitHub Packages (so maybe fewer CDN quirks). That was close but not enough, so I wrapped it using wretry.action as shown below. Unfortunately that skips the built-in caching due to being "loaded as a module", so caching has to be done separately (e.g. via gradle-build-action). This might be good enough, if you prefer to document that as a workaround.

- name: Set up JDK ${{ matrix.java }}
  uses: Wandalen/wretry.action@v1.0.20
  with:
    action: actions/setup-java@v3
    with: |
      distribution: temurin
      java-version: ${{ matrix.java }}
    attempt_limit: 3
    attempt_delay: 2000
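
A minimal sketch of handling the cache separately, as the comment above suggests, since the wrapped setup-java step skips its built-in caching. The `gradle/gradle-build-action@v2` version tag and the `./gradlew build` command are assumptions for illustration and are not taken from the original workflow.

```yaml
# Assumed follow-up steps; place them after the wrapped setup-java step.
- name: Set up Gradle with caching
  uses: gradle/gradle-build-action@v2   # version tag is an assumption
- name: Build
  run: ./gradlew build                  # hypothetical build command
```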

e-korolevskii commented 2 years ago

Hello @ben-manes,

Thanks for suggesting this workaround. After investigation, we concluded that the problem was caused by a captcha being served (you can learn more about this in issue #334) and, as giltene answered, they solved this problem on their side. We will keep an eye on the situation and investigate the possibility of implementing this feature.

ben-manes commented 2 years ago

@e-korolevskii,

I ran into some network stability problems with temurin, so wretry made sense. It might be due to the high concurrency of my build triggering DDoS protection, as breaking it into 117 jobs (at 20x parallel in free tier) reduced the build time by 2hrs. Using wretry seems to have stabilized things.

e-korolevskii commented 2 years ago

Good evening, @ben-manes. We will try to reproduce this network error and explore possible ways of solving it.

e-korolevskii commented 2 years ago

Hey @ben-manes,

After investigation, we came to the conclusion that the problem was caused by Harden Runner, because one of the polled addresses was not on the allowed list. A simple solution in this situation is to add the address 54.185.253.63:443 to the list of allowed endpoints. Does this solve your problem?
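
A minimal sketch of what adding that address could look like, assuming the workflow runs Harden Runner in blocking mode. The version tag and the `github.com:443` entry are assumptions for illustration; only `54.185.253.63:443` comes from the comment above.

```yaml
- name: Harden Runner
  uses: step-security/harden-runner@v2   # version tag is an assumption
  with:
    egress-policy: block
    allowed-endpoints: >
      github.com:443
      54.185.253.63:443
```

Any other endpoints the build already depends on would need to stay in the same list.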

ben-manes commented 2 years ago

great, thanks @e-korolevskii!

Note that Harden Runner doesn't list that IP as blocked, but I am fine assuming that it did block and was unlisted.

Error: StepSecurity Harden Runner: DNS resolution for domain blob.blz21prdstrz04a.store.core.windows.net. was blocked. This domain is not in the list of allowed-endpoints.
Error: StepSecurity Harden Runner: DNS resolution for domain blob.blz21prdstrz04a.trafficmanager.net. was blocked. This domain is not in the list of allowed-endpoints.
Error: StepSecurity Harden Runner: Traffic to IP Address 20.150.90.196 was blocked

e-korolevskii commented 2 years ago

Hello @ben-manes,

Thanks for the note; it might be very helpful for other users. For now, I'll close this issue as resolved. If you have additional questions, feel free to ask them here or create a separate issue.

Have a good day!