Ravenports / ravenadm

Administration tool for Ravenports
http://www.ravenports.com
ISC License

centralized downloader #65

Open jrmarino opened 1 year ago

jrmarino commented 1 year ago

The "lockf" approach fails occasionally, even on platforms that supposedly support it natively (FreeBSD, I'm looking at you). The lockf programs provided by ravenports for linux, solaris and netbsd "lock" up.

I'm looking to:

  1. Remove the need for a R/W null mount for the distfiles
  2. Have a centralized fetch mechanism that knows when multiple ports want the same file at the same time

The idea is that instead of fetching directly, the builder makes a proxy request to a central downloader, which simply copies the file into the builder's distfiles directory from a central library if it already has it. If it doesn't, it performs one (1) download and makes the other requests wait. When done, it copies the file to the builders that need it.

Now if this happens, I'll need another way to extract patches out of the builder.
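
A minimal sketch of that proxy idea, assuming a per-file lock and a central cache directory; the names here (CACHE_DIR, fetch_for_builder) are illustrative, not existing ravenadm code:

```python
# Hypothetical sketch of the central-downloader idea: one real download per
# distfile, other requests wait, then each builder gets a copy from the cache.
import os
import shutil
import threading
import urllib.request

CACHE_DIR = "/var/ravenports/distfile-cache"   # assumed central library
_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()

def _lock_for(name: str) -> threading.Lock:
    # One lock per distfile name, so concurrent requests for the same file
    # serialize while different files can download in parallel.
    with _locks_guard:
        return _locks.setdefault(name, threading.Lock())

def fetch_for_builder(url: str, name: str, builder_distfiles: str) -> None:
    cached = os.path.join(CACHE_DIR, name)
    with _lock_for(name):
        if not os.path.exists(cached):
            tmp = cached + ".part"
            urllib.request.urlretrieve(url, tmp)   # the single real download
            os.rename(tmp, cached)
    # Every builder that asked for the file gets a copy of the cached version.
    shutil.copy2(cached, os.path.join(builder_distfiles, name))
```

The per-file lock is what makes "the other requests wait": the first caller downloads, the rest block until the cache entry exists and then just copy it.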

jrmarino commented 1 year ago

Another approach is to pre-copy existing files into the builder's distfiles directory during builder creation. We know which distfiles are expected. If a file isn't present, the builder would have to download it first. We still have the problem of what to do when multiple builders want the same file, but it might be easier to handle here.

Bonus is that no IPC between ravenadm and builder is required for this.
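
A minimal sketch of the pre-copy step, assuming the expected distfile names are already known at builder creation; the helper name and directory arguments are made up for illustration:

```python
# Illustrative sketch (not ravenadm code): copy every expected distfile that
# already exists centrally, and report the ones that still need a download.
import os
import shutil

def precopy_distfiles(expected: list[str], central: str, builder_distfiles: str) -> list[str]:
    """Copy known distfiles into the builder; return names still missing."""
    missing = []
    for name in expected:
        src = os.path.join(central, name)
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(builder_distfiles, name))
        else:
            missing.append(name)
    return missing
```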

jrmarino commented 1 month ago

Maybe we can use a shared SQLite database here. Basically, the distfiles would be inserted into the database, which would record which distfile is "locked" because it's downloading. If a port has multiple downloads, it could skip the locked download and start fetching a "free" one.

ravenadm needs to delete and recreate the database on each run so that an interrupted run can't leave it corrupted.

The solution has to guarantee only one download of a given file at a time.
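
A rough sketch of what that shared database could look like, assuming a single table with a "locked" flag; the schema and function names are purely illustrative:

```python
# Sketch only: a lock table in a shared SQLite database. ravenadm would
# delete and recreate this file on each run, then insert one row per distfile.
import sqlite3

def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path, timeout=30)
    conn.execute("CREATE TABLE IF NOT EXISTS distfiles ("
                 " name TEXT PRIMARY KEY,"
                 " locked INTEGER NOT NULL DEFAULT 0,"
                 " fetched INTEGER NOT NULL DEFAULT 0)")
    return conn

def try_claim(conn: sqlite3.Connection, name: str) -> bool:
    # Atomically flip locked 0 -> 1; returns False if another builder holds
    # the file, so the caller can move on to a "free" distfile and retry later.
    with conn:
        cur = conn.execute(
            "UPDATE distfiles SET locked = 1 WHERE name = ? AND locked = 0", (name,))
        return cur.rowcount == 1

def mark_done(conn: sqlite3.Connection, name: str) -> None:
    with conn:
        conn.execute(
            "UPDATE distfiles SET locked = 0, fetched = 1 WHERE name = ?", (name,))
```

try_claim leans on SQLite performing the UPDATE atomically, so two builders can't both flip the same row from unlocked to locked.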

also - might be time to drop distfiles null mount but that will require precopying and postcopying logic.

another possibility is to create a small daemon that handles everything in memory and transfers distfiles from the true distfiles directory as needed.

jrmarino commented 1 month ago

Maybe a specialized task could handle this. Communication would be through chit requests, one chit per request. The chit files would probably be placed in /tmp/file-request.

format:

<sha256-digest>
<size>
<path>

e.g.

fdd903dc070654908191059cf566bad7dbec2f6c7d266ca5206f649ded2f54e6
368546
ravenports-ravenadm-9e2a426.tar.gz

If the file is cached, the task copies it to the local distfiles and removes the chit. Otherwise the file is internally recorded and the chit is moved to /tmp/file-fetch. If the file is already recorded, nothing happens.

Any file in /tmp/file-fetch allows the builder's fetch to pull the file itself. If the download fails, the chit is moved to /tmp/file-fail. If the download succeeds, the chit is moved to /tmp/file-downloaded. When the task sees a file in /tmp/file-downloaded, it copies the file from the builder to the true distfiles directory and then deletes the chit.
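
To make the flow concrete, here is a sketch of the chit-processing side, assuming the three-line chit format above; the cache path, function names, and the in-memory "seen" set are assumptions, not existing code:

```python
# Sketch of the chit-processing task: the directory names come from the
# comment above, everything else is illustrative.
import os
import shutil

REQUEST_DIR = "/tmp/file-request"
FETCH_DIR = "/tmp/file-fetch"
CACHE_DIR = "/var/ravenports/distfiles"   # assumed true distfiles directory
seen: set[str] = set()                     # digests already recorded

def read_chit(path: str) -> tuple[str, str, str]:
    # A chit is three lines: sha256 digest, size, and the distfile name.
    with open(path) as f:
        digest, size, name = (line.strip() for line in f.readlines()[:3])
    return digest, size, name

def process_requests(builder_distfiles: str) -> None:
    for chit in os.listdir(REQUEST_DIR):
        full = os.path.join(REQUEST_DIR, chit)
        digest, _size, name = read_chit(full)
        cached = os.path.join(CACHE_DIR, name)
        if os.path.isfile(cached):
            # Cached: hand the file to the builder and retire the chit.
            shutil.copy2(cached, os.path.join(builder_distfiles, name))
            os.unlink(full)
        elif digest not in seen:
            # Not cached and not yet recorded: record it, let the builder fetch.
            seen.add(digest)
            shutil.move(full, os.path.join(FETCH_DIR, chit))
        # Already recorded: leave the chit alone until the download resolves.
```

The /tmp/file-downloaded and /tmp/file-fail halves would follow the same pattern: watch the directory, copy or record the result, and delete the chit.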

jrmarino commented 1 month ago

or we could do what the original post says and create a true service where the service itself downloads the files and the builders don't do any downloading. I'm not sure how much progress I made today. I want this to be as robust as possible.

jrmarino commented 1 month ago

This is probably getting too complicated. This would be simpler (sketched below):

  1. at ravenadm startup, delete all existing "*.lk" files; they are stale
  2. use an infinite loop. Iterate through the list of files:
    if <md5>.lk exists, skip to the next file but add the file to an "ondeck" list
    if <md5>.lk does not exist, create it and attempt to download to a temporary file
    if the download fails, mark the global return code as error
    if the download succeeds: copy the temporary file to its permanent location and delete <md5>.lk
    if ondeck is not empty:
    a. copy ondeck back to the file queue
    b. clear ondeck
    c. sleep 5 seconds
    if ondeck is empty: exit
  3. return the global return code (0 or 1)

Yes, there could be a race, but in the case where two ports tie, both will download to random temporary files and the last one to finish will overwrite the first; the contents are the same anyway. We don't need lockf at all.
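
A rough sketch of that loop, assuming the lock name is the md5 of the distfile name and that a failed download just flags the return code; every identifier here is a placeholder:

```python
# Sketch of the lock-file loop described above (not ravenadm code).
import hashlib
import os
import shutil
import tempfile
import time
import urllib.request

DISTFILES = "/var/ravenports/distfiles"

def clear_stale_locks() -> None:
    # Step 1: at ravenadm startup, any leftover *.lk files are stale.
    for entry in os.listdir(DISTFILES):
        if entry.endswith(".lk"):
            os.unlink(os.path.join(DISTFILES, entry))

def fetch_all(wanted: dict[str, str]) -> int:
    """wanted maps distfile name -> URL; returns 0 on success, 1 on any failure."""
    rc = 0
    queue = list(wanted)
    while queue:
        ondeck = []
        for name in queue:
            lk = os.path.join(DISTFILES, hashlib.md5(name.encode()).hexdigest() + ".lk")
            if os.path.exists(lk):
                ondeck.append(name)          # someone else is on it; revisit later
                continue
            open(lk, "w").close()            # take the lock (the tiny race is accepted)
            try:
                fd, tmp = tempfile.mkstemp(dir=DISTFILES)
                os.close(fd)
                urllib.request.urlretrieve(wanted[name], tmp)
                shutil.move(tmp, os.path.join(DISTFILES, name))
            except OSError:
                rc = 1                        # mark the global return code as error
            finally:
                os.unlink(lk)
        if not ondeck:
            break
        queue = ondeck                        # copy ondeck back to the file queue
        time.sleep(5)
    return rc
```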