bblanchon / pdfium-binaries

📰 Binary distribution of PDFium

Reduce conda release frequency? #146

Closed by mara004 5 months ago

mara004 commented 5 months ago

anaconda.org has a storage limit of 3 GiB, which seems fairly narrow. (You can check status in the Storage tab of your account settings.) In that light, maybe weekly releases are too frequent? Would it be possible to publish the conda packages only every other week, or once a month, so we don't need to be afraid of running into the limit?

bblanchon commented 5 months ago

Lowering the frequency would only postpone the issue. I think we should discard old versions instead.

mara004 commented 5 months ago

> I think we should discard old versions instead.

Sounds dangerous? I believe this would threaten significant breakage in pypdfium2, like breaking an entire older major version of the helpers, due to the version pinning and bounding that is necessary to achieve inherent ABI/API safety. (E.g., PyPI has a big warning against deleting older versions.)

I'm not saying it's impossible if the need arises, but we'd have to be very careful about which versions to delete, and it would perhaps still risk breaking some dependants of pypdfium2.

mara004 commented 5 months ago

Here's some calculation:

I believe the reason why package sites are imposing such limits is just to send a signal that people should avoid too frequent releases and try to keep packages small.

mara004 commented 5 months ago

@bblanchon I was wondering if you have reached a resolution on this yet? Would you be willing to take a PR to run conda publishing only once per month? It should not affect the other builds, of course.

I believe this would be fairly relevant to achieving a sustainable packaging concept.

bblanchon commented 5 months ago

I didn't do anything yet because:

  1. There is no rush
  2. I'm tired of all the extra work for Anaconda
  3. Should we do a monthly release for NuGet too?
  4. What would people say (hint: people rarely thank) if versions in NuGet/Anaconda differ from the ones on GitHub?
  5. It's possible that Anaconda changes the limits before we reach them.
  6. They could also make an exception for us.

mara004 commented 5 months ago

(4a) First of all, of course I really do appreciate the effort you put in this project. 🤗

(2) I understand only too well that you're tired of conda (hint: I am too), which is why I'd like to finally get this matter over with by adding the final touch that brings it to a sustainable state.

(1) I know this does not cause any immediate issues, but we also need to think forward. It's a bit of a problem that we're currently consuming space at 4x speed compared to what would seem sustainable. Any other project can now depend on specific ranges of published releases, so deleting things generally risks downstream breakage. Figuring out later which releases can be deleted "safely" and which can't would be tedious and still prone to oversights.

(3) I don't know what NuGet's limit is; it might be high enough to be OK for weekly builds. We'd need to talk about this with the people who are using NuGet. (CC @sungaila)

(4b) In this case it would follow a clear rationale as outlined by people who depend on the packages. I, and implicitly pypdfium2's downstreams, would be thankful indeed for differing here because it would fix a flaw in our packaging logic.

(5, 6) IMHO it's good practice to design things to work within current boundaries and not rely on an uncertain relaxation of limits that might never happen. Otherwise, we'd need an early confirmation that they're willing to make an exception for us. FWIW I bet they would say something like "why don't you just lower the release pace, weekly is quite frequent according to conda customs" 😅
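
To put rough numbers on point (1), here is a minimal sketch of how long the 3 GiB limit would last at different release cadences if nothing is ever deleted. The per-release size (`ASSUMED_RELEASE_SIZE_MIB`) is a made-up placeholder, not a measured value, and the function name is only illustrative; substitute the real total from the Storage tab.

```python
# Storage limit quoted in this thread: 3 GiB on anaconda.org.
LIMIT_MIB = 3 * 1024

# ASSUMPTION: hypothetical combined size of one release across all
# platforms. This is a placeholder, not a measured figure.
ASSUMED_RELEASE_SIZE_MIB = 50.0


def years_until_full(release_size_mib, releases_per_year, limit_mib=LIMIT_MIB):
    """Years of releases that fit under the limit if nothing is deleted."""
    if release_size_mib <= 0 or releases_per_year <= 0:
        raise ValueError("size and cadence must be positive")
    return limit_mib / (release_size_mib * releases_per_year)


weekly_years = years_until_full(ASSUMED_RELEASE_SIZE_MIB, 52)
monthly_years = years_until_full(ASSUMED_RELEASE_SIZE_MIB, 12)

# The ratio does not depend on the assumed release size.
speedup = monthly_years / weekly_years

print(f"weekly:  {weekly_years:.2f} years until full")
print(f"monthly: {monthly_years:.2f} years until full")
print(f"weekly uses space {speedup:.2f}x faster than monthly")
```

Whatever the real release size is, the ratio between weekly and monthly is 52/12, which is presumably where the "4x speed" figure comes from.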

mara004 commented 5 months ago

On the other hand, should pdfium ever decide to get rid of AGG and force Skia on embedders, we'd be in space trouble anyway, because Skia is big. Although they are actively pursuing Skia, it seems that even Google itself has binary size concerns.

sungaila commented 5 months ago

Personally I don't need weekly NuGet releases. I update the PDFium libs each time I release a new version which happens maybe every few months. You can tell which versions I had pinned by the download counter of the Win32 packages. 😄

@bblanchon Your work is very much appreciated! I really like your friendly yet productive attitude when discussing issues and pull requests.

mara004 commented 5 months ago

It's a bit vague, but https://github.com/NuGet/Home/issues/6208#issuecomment-346171972 indicates to me that there might just be no disk usage limit on NuGet? @sungaila Do you have any definite information on this?

bblanchon commented 5 months ago

Alright, I did it. The workflow now checks the day of the month to release the Anaconda package only once a month.
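
For readers wondering how a day-of-month gate can work, here is a minimal sketch. It is not the actual workflow logic in this repository: it assumes the build runs weekly on a fixed weekday, and the function name `should_publish_conda` and the seven-day window are illustrative choices.

```python
from datetime import date, timedelta


def should_publish_conda(run_date):
    """Return True only for a run that falls in the first 7 days of a month.

    ASSUMPTION: runs happen weekly on a fixed weekday, so exactly one run
    per month lands in days 1 to 7.
    """
    return run_date.day <= 7


def weekly_runs(start, count):
    """Hypothetical schedule: `count` runs, one every 7 days from `start`."""
    return [start + timedelta(weeks=i) for i in range(count)]


# Simulate one year of weekly runs starting on Monday, 1 January 2024.
runs = weekly_runs(date(2024, 1, 1), 52)
published = [d for d in runs if should_publish_conda(d)]

for d in published:
    print(d.isoformat())
```

With a weekly schedule, every month contains exactly one run in its first seven days, so the other builds keep their weekly pace while the conda upload happens once a month.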

mara004 commented 5 months ago

Thanks for the surprise!

FWIW pypdfium2 does not depend on any releases < 120.0.6097.0, and I doubt if anyone else does, so I wouldn't object to deleting these if needed or desired.
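
If it ever comes to that, selecting the candidates could look roughly like the sketch below. Only the threshold `120.0.6097.0` comes from this thread; the sample version list and the helper names are hypothetical, and actually deleting anything on anaconda.org is left out on purpose.

```python
def parse_version(version):
    """Turn a dotted version such as '120.0.6097.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Threshold mentioned in the thread: pypdfium2 needs nothing below this.
THRESHOLD = parse_version("120.0.6097.0")


def deletable(versions):
    """Return the versions that are strictly older than the threshold."""
    return [v for v in versions if parse_version(v) < THRESHOLD]


# ASSUMPTION: illustrative version strings, not the real list of uploads.
sample = ["119.0.6043.0", "120.0.6097.0", "121.0.6110.0", "118.0.5993.0"]

print(deletable(sample))
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions whose components differ in digit count.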

mara004 commented 5 months ago

In theory we could also consider excluding platforms where there will probably be no actual demand on conda (win-32, linux-32, linux-armv7l ?), but I don't want to trouble you any further. Just an idea e.g. in case pdfium gets bigger or something.

bblanchon commented 5 months ago

Indeed, their download page doesn't offer any 32-bit option.