caksoylar / keymap-drawer

Visualize keymaps that use advanced features like hold-taps and combos, with automatic parsing
https://caksoylar.github.io/keymap-drawer
MIT License

Glyph download timeouts #116

Closed minusfive closed 2 weeks ago

minusfive commented 1 month ago

I have been experiencing regular timeouts and failures that seem related to the glyph download step: https://github.com/minusfive/zmk-config/actions/runs/11321454822/job/31500930916

INFO: using config args: -c keymap-drawer/config.yaml
INFO: drawing for corneish_zen
INFO:   got extra parse args: 
INFO:   got extra draw args: 
Traceback (most recent call last):
  File "/home/runner/.local/bin/keymap", line 8, in <module>
    sys.exit(main())
  File "/home/runner/.local/lib/python3.10/site-packages/keymap_drawer/__main__.py", line 212, in main
    draw(args, config.draw_config)
  File "/home/runner/.local/lib/python3.10/site-packages/keymap_drawer/__main__.py", line 43, in draw
    drawer = KeymapDrawer(
  File "/home/runner/.local/lib/python3.10/site-packages/keymap_drawer/draw/draw.py", line 25, in __init__
    self.init_glyphs()
  File "/home/runner/.local/lib/python3.10/site-packages/keymap_drawer/draw/glyph.py", line 58, in init_glyphs
    self.name_to_svg |= self._fetch_glyphs(rest)
  File "/home/runner/.local/lib/python3.10/site-packages/keymap_drawer/draw/glyph.py", line 88, in _fetch_glyphs
    return dict(zip(names, p.map(fetch_fn, names, urls, timeout=FETCH_TIMEOUT)))
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 623, in result_iterator
    yield _result_or_cancel(fs.pop(), end_time - time.monotonic())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 460, in result
    raise TimeoutError()
concurrent.futures._base.TimeoutError
ERROR: parsing or drawing failed for corneish_zen!
Error: Process completed with exit code 1.

I started noticing this a few days ago, when about 50% of jobs failed. The failure rate has gradually gone up since then, and the last bunch of runs have failed consistently: https://github.com/minusfive/zmk-config/actions/runs/11321454822/usage

Update

Locally it works fine every time, it seems. So potentially a GH specific networking issue? Or perhaps the icon lib hosts are blocking GH requests?

caksoylar commented 1 month ago

Thanks for reporting. I wonder if adding retries would solve it. There's no logging that says which glyph fetch failed, so it is hard to tell from the logs. If it isn't solved by retries, one solution I can think of is to have our own cache of glyphs in GH. Or prioritize #85, even though it'd require manual work from users.

caksoylar commented 1 month ago

Also, it looks like you use mdi, so I assume you see the errors with that source.

caksoylar commented 1 month ago

I pushed https://github.com/caksoylar/keymap-drawer/commit/bbead1c2a268c9f26e6331491e4d81a91dfdfa40, so hopefully we should be able to see which URL fetch is timing out specifically.
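For readers following along, here is a minimal sketch of how a parallel fetch can report which glyph timed out, rather than surfacing a bare `TimeoutError` from `Executor.map` as in the traceback above. It is not the code from the linked commit; `fetch_all`, its parameters, and the returned `failed` mapping are hypothetical names chosen for illustration.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

logger = logging.getLogger(__name__)


def fetch_all(names, urls, fetch_fn, timeout=10.0, max_workers=8):
    """Run fetch_fn(name, url) in a thread pool and log failures per URL.

    Hypothetical helper: returns (results, failed), where `failed` maps each
    glyph name that timed out or raised to the URL that was being fetched.
    """
    results, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Keep the name and URL next to each future so errors can be attributed.
        futures = [(n, u, pool.submit(fetch_fn, n, u)) for n, u in zip(names, urls)]
        for name, url, fut in futures:
            try:
                # The timeout applies to each wait separately, not to the whole batch.
                results[name] = fut.result(timeout=timeout)
            except FutureTimeoutError:
                logger.error("timed out fetching glyph %r from %s", name, url)
                failed[name] = url
            except Exception as exc:  # any other fetch error, reported with its URL
                logger.error("failed fetching glyph %r from %s: %s", name, url, exc)
                failed[name] = url
    # Note: leaving the with-block still waits for slow fetches to finish.
    return results, failed
```

Submitting futures individually, instead of using `map`, is what makes it possible to tie each error back to a specific name and URL.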

minusfive commented 1 month ago

Of course, now it won't fail 🤣 https://github.com/minusfive/zmk-config/actions/runs/11329959089/job/31519554766

alinelena commented 1 month ago

It fails regularly for me: https://github.com/alinelena/mlego-zmk/actions/runs/11434556155/job/31808326605 but not on MDI this time. I just added some simple Unicode glyphs, about 10 of them, and it has already failed 3 times. Previously, restarting the workflow was usually enough to make it pass.

caksoylar commented 1 month ago

It doesn't matter what you added in the last commit; I see your config has MDI glyphs, and they will be fetched every time the workflow is run.

Maybe unpkg that we use for getting MDI is getting overloaded, or rate limiting GH APIs, or something else of the sort. I'll implement a retry logic anyway and see if that is enough to fix the issue.

caksoylar commented 1 month ago

I can reproduce the issue locally, and even observe the slowness browsing https://unpkg.com/browse/@mdi/svg. For example one SVG load took 12 seconds, another 2.8, another 2.3. I feel like increasing the timeouts and adding retries is the best we can do right now, but I'll look into caching for the workflow.
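As an illustration of the "longer timeouts plus retries" approach, here is a minimal sketch using only the standard library. It is not the project's actual implementation; `fetch_with_retries` and its default values are assumptions, and the `opener` and `sleep` parameters exist only so the function can be exercised without network access.

```python
import time
from urllib.request import urlopen


def fetch_with_retries(url, attempts=3, timeout=10.0, backoff=1.0,
                       opener=urlopen, sleep=time.sleep):
    """Fetch `url` as text, retrying on network errors with exponential backoff.

    Hypothetical helper: waits backoff, 2*backoff, 4*backoff, ... seconds
    between attempts and raises RuntimeError once all attempts have failed.
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as resp:
                return resp.read().decode("utf-8")
        except OSError as exc:  # covers URLError, socket timeouts, connection resets
            last_exc = exc
            if attempt < attempts - 1:
                sleep(backoff * 2 ** attempt)
    raise RuntimeError(f"failed to fetch {url} after {attempts} attempts") from last_exc
```

With load times like the 12 seconds observed above, the per-attempt `timeout` matters as much as the retry count: a retry only helps if a single attempt is given enough time to succeed.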

caksoylar commented 1 month ago

5d8f0ca might help but I am not closing this for now.

magicDGS commented 2 weeks ago

I am also seeing the issue, both locally on my Windows machine and in the GH workflow: https://github.com/magicDGS/glove80-keymap-config/actions/runs/11903571394

What is also interesting is that the workflow doesn't fail but shows as green, so the problem can easily get overlooked...

minusfive commented 2 weeks ago

What is also interesting is that the workflow doesn't fail but shows as green, so the problem can easily get overlooked...

You can add fail_on_error: true to the workflow job and it'll bubble up any errors and fail the build.

caksoylar commented 2 weeks ago

Given it still happens with pretty generous timeouts and retries, I think the best solution is to find an alternative place to get them, not unpkg. Or I can remove them from docs (or even default configs) and that can discourage people from using them.

magicDGS commented 2 weeks ago

@minusfive - Thanks for the hint, I will try that

@caksoylar - Sorry, after checking the history I just realized that my workflow is running the released version, which doesn't include the fix. I will try the main branch on GitHub today.

magicDGS commented 2 weeks ago

It looks fixed when using the tool from commit 0943c950114d678f9c6bdad4d16480cae8f82939 (after that commit, the main branch contains a parsing defect, see #130). So it looks like the retry fixed it, and I will be using that version from now on.

magicDGS commented 2 weeks ago

Okay, using the version from 140ca433b96ea7de639b4599cc5340f74370fdf4, the retries are still happening on GitHub.

I have a proposal, although I am not sure if it will work: what about directly using the GitHub raw URLs for the Material icons from https://github.com/Templarian/MaterialDesign-SVG, which in the end should be the same files as the unpkg ones?
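A minimal sketch of what that proposal could look like, assuming the icons live under an `svg/` directory on the `master` branch of that repository (both are assumptions and should be checked against the repository). `mdi_raw_url` and the `mdi:` prefix handling are hypothetical and not taken from keymap-drawer.

```python
import re

# Assumed layout: <repo>/<ref>/svg/<icon-name>.svg served via raw.githubusercontent.com
RAW_TEMPLATE = (
    "https://raw.githubusercontent.com/Templarian/MaterialDesign-SVG/{ref}/svg/{name}.svg"
)


def mdi_raw_url(glyph, ref="master"):
    """Build a GitHub raw URL for a Material Design icon.

    Accepts either a bare icon name ("keyboard") or a prefixed one ("mdi:keyboard").
    """
    name = glyph.split(":", 1)[-1]
    # Only allow lowercase letters, digits, and dashes, so that nothing
    # unexpected ends up in the URL path.
    if not re.fullmatch(r"[a-z0-9-]+", name):
        raise ValueError(f"unexpected icon name: {glyph!r}")
    return RAW_TEMPLATE.format(ref=ref, name=name)
```

Pinning `ref` to a tag or commit instead of a branch would keep icon contents stable between runs, at the cost of not picking up newly added icons.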

magicDGS commented 2 weeks ago

A quick test that worked this time (not sure if it always will): https://github.com/magicDGS/glove80-keymap-config/actions/runs/11922628413

Edit: sorry, that run was the second one, without my changes; most likely the glyphs were just cached already...

magicDGS commented 2 weeks ago

This is the one that actually uses my branch: https://github.com/magicDGS/glove80-keymap-config/actions/runs/11922628413/job/33229536996 (see my draft PR https://github.com/caksoylar/keymap-drawer/pull/132), but I am now not sure whether the cache is part of the equation.

caksoylar commented 2 weeks ago

Caching shouldn't be happening on GH Actions, which has a fresh container every time. Commented on #132, thanks.

caksoylar commented 2 weeks ago

I don't even see svg under https://unpkg.com/browse/mdi/ anymore, so things definitely broke on the unpkg side. Hopefully #132 fixed this, feel free to re-open if you have issues with MDI or other glyphs.