repman-io / repman

Repman - PHP Repository Manager: packagist proxy and host for private packages
https://repman.io
MIT License
519 stars 106 forks source link

Packagist Mirror not always up to date #328

Open giggsey opened 3 years ago

giggsey commented 3 years ago

Not much detail here (sorry!), but I've noticed a few times on our work instance that the packages mirrored from Packagist don't always have up to date metadata. I don't think it's all packages, and there is no correlation between release dates of the packages

Purging the json directories forces it to download the latest versions again, but this isn't ideal.

Running latest stable via docker-compose

akondas commented 3 years ago

Yes, metadata is not being downloaded live so there may be a slight delay. At present, this is done by the cron that runs every 6 minutes. Perhaps there is a problem with it in the docker-compose :thinking:

You should be able to check cron logs in var/log/cron.log

giggsey commented 3 years ago

The delay is fine, and it does pick up newer packages. It's as if the cron fails randomly, and doesn't pick up any changes during that period. Then later runs don't look back either (I think this is because it uses the RSS feeds to detect recent changes? - Maybe a packagist fault?)

There is no cron.log

akondas commented 3 years ago

So, now docker cron will log output to var/log/cron.log #330. As for the problem itself, there are now two options: either this rss had a problem or our command in cron :thinking:

giggsey commented 3 years ago

Should there be a failsafe in case of RSS problems on Packagist? (Doesn't the RSS feed only list recent releases?)

Maybe a daily/weekly refresh of all the metadata for the mirror?

akondas commented 3 years ago

Daily/weekly refresh sounds interesting ... @karniv00l what do you think?

karniv00l commented 3 years ago

RSS feed is only for new releases but we refresh metadata fully every 6 minutes along with the provider-includes. However it's possible that download operation failed, this will not show up in logs

giggsey commented 3 years ago

Just created https://github.com/composer/packagist/issues/1176 on their side.

It looks like the URL used by repman hasn't updated since September last year.

karniv00l commented 3 years ago

I see:

  <channel>
    <title>New Releases</title>
    <description>Latest releases of all packages.</description>
    <pubDate>Tue, 06 Apr 2021 09:38:38 +0000</pubDate>
    <generator>Packagist</generator>
    <link>https://packagist.org/explore/</link>
    ...
giggsey commented 3 years ago

I think they fixed it shortly after I created the issue, but no reply yet

giggsey commented 3 years ago

This still happens frequently for us.

akondas commented 3 years ago

I would like to help solve this problem. I discovered one thing:

    public function getAsyncContents(string $url, array $headers, callable $onFulfilled): void
    {
        ($this->queue)('get', $url, $headers)
            ->then(function (ResponseInterface $response) use ($onFulfilled): void {
                $stream = $response->getBody()->detach();
                if (!is_resource($stream)) {
                    return;
                }
                $onFulfilled($stream);
            });
    }

I'm not sure, but it seems to me that if there is a mistake when downloading it will be ignored? I wonder, maybe some logger could add there.

giggsey commented 3 years ago

This might not be a packagist issue, but the cron itself.

I just attached to the cron container, and the log shows:

bash-5.0# tail -f /app/var/log/cron.log 
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8
crond: crond (busybox 1.31.1) started, log level 8

crontab is correct too.

However, the security update-db doesn't appear to have run in weeks, because I just ran it manually and it flagged a lot of packages. The previous notification I have for a vulnerable package that hasn't been fixed was 5 weeks ago when repman was last restarted.

akondas commented 3 years ago

Thanks for update. In that case, we will check the docker configuration. (cc @karniv00l)

karniv00l commented 3 years ago

@giggsey try restarting containers (docker-compose down && docker-compose up -d)

I've just checked and mine logs look good (about 1 hour span):

crond: crond (busybox 1.31.1) started, log level 8
crond: USER root pid   8 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid  21 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid  34 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid  49 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid  56 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid  63 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid  70 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid  71 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid  84 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid  91 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid  98 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 105 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid 112 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 119 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid 126 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 133 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid 140 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 147 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 148 cmd /app/bin/console repman:proxy:sync-metadata
crond: USER root pid 149 cmd /app/bin/console repman:security:update-db
crond: USER root pid 178 cmd /app/bin/console repman:proxy:sync-releases
crond: USER root pid 185 cmd /app/bin/console repman:proxy:sync-metadata
giggsey commented 3 years ago

Restarted the entire stack (down + up like you suggested), and still no luck.

Went into the container again, and can crond manually with debug level 0, still nothing.

Changed the crontab file, and it started working. It continues to work even after I revert the crontab (and restarting the cron container).

Will monitor for a while.

SiebelsTim commented 2 years ago
    public function getAsyncContents(string $url, array $headers, callable $onFulfilled): void
    {
        ($this->queue)('get', $url, $headers)
            ->then(function (ResponseInterface $response) use ($onFulfilled): void {
                $stream = $response->getBody()->detach();
                if (!is_resource($stream)) {
                    return;
                }
                $onFulfilled($stream);
            });
    }

On our installation, we always get a BufferedResponse, which does not implement detach in any usable way: https://github.com/reactphp/http/blob/1.x/src/Io/BufferedBody.php#L44

Therefore, all request are always silently skipped.

brpauwels commented 1 year ago

The issue described by @SiebelsTim can be fixed by downgrading react/http back to 1.5.0.