SynoCommunity / spkrepo

Synology Package Repository
http://spkrepo.readthedocs.org
MIT License

Improve admin webui #112

Closed mreid-tt closed 6 months ago

mreid-tt commented 8 months ago

Some admin web UI improvements:

  • fix date formatting as well as labels for columns and filters
  • fix screenshot management
  • fix architecture management

Other admin improvements:

  • fix app configuration (leftover from refactor #102)
  • fix handling of new service dependencies (fixes #39)
  • fix maintenance of md5 hashes
  • add major_version check (fixes #63)
  • other/environment fixes

mreid-tt commented 8 months ago

@hgy59, I was inspired by the work you did on #111 and have submitted some improvements for the admin screens. This mostly covers labels on columns and filters, the addition of new filters, and fixes to date formatting.
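For a sense of the changes, the Flask-Admin hooks involved look roughly like this (the model and field names are illustrative, not the exact ones in the PR):

    # a sketch of the kind of Flask-Admin tweaks meant here; model and
    # field names are illustrative, not the exact ones in the PR
    from flask_admin.contrib.sqla import ModelView

    class BuildView(ModelView):
        # friendlier headers instead of raw column names
        column_labels = {"path": "File", "md5": "MD5"}
        # extra filters in the list view
        column_filters = ["architectures", "firmware"]
        # consistent, None-safe date rendering
        column_formatters = {
            "insert_date": lambda v, c, m, p: m.insert_date.strftime("%Y-%m-%d %H:%M")
            if m.insert_date
            else None
        }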

I had a thought to fix the screenshot management as well, but I couldn't make heads or tails of the code. Essentially, I think the edit function is redundant since you can only really delete the file there. This is problematic since it throws an SQL exception as follows:

Integrity error. (psycopg2.errors.NotNullViolation) null value in column "path" of relation "screenshot" violates not-null constraint DETAIL: Failing row contains (8, 4, null). [SQL: UPDATE screenshot SET path=%(path)s WHERE screenshot.id = %(screenshot_id)s] [parameters: {'path': None, 'screenshot_id': 8}] (Background on this error at: https://sqlalche.me/e/14/gkpj) 

After the file is deleted you are left with a blank screenshot like this (second entry):

[Screenshot: 2024-01-01 at 3:32:40 PM]



From there you can just delete the row. However, if you delete the row independently, the file isn't deleted. As such, I am proposing that we remove the edit screen altogether and hook file deletion into the row deletion code.

EDIT: I found a simple solution in the Flask-Admin docs; the example code for the image uploader was useful.
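The pattern from that Flask-Admin example looks roughly like this, adapted to our Screenshot model (data_path is illustrative):

    # the pattern from the Flask-Admin image-upload example, adapted to
    # Screenshot; data_path is illustrative
    import os
    import os.path as op

    from sqlalchemy import event

    @event.listens_for(Screenshot, "after_delete")
    def delete_screenshot_file(mapper, connection, target):
        # remove the file from disk when its row is deleted
        if target.path:
            try:
                os.remove(op.join(data_path, target.path))
            except OSError:
                pass  # the file may already be gone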

mreid-tt commented 8 months ago

@publicarray @hgy59, this PR should be good to merge. Let me know if there are any improvements you would recommend.

mreid-tt commented 8 months ago

@Diaoul, I was looking at the backend code as well as the database and saw that there is a build/md5 field in the database which doesn't seem to get populated when I upload a package. I see that it is referenced in nas.py as part of the response entries, but I don't see where in api.py it is set.

I do however see that it is created as part of the test suite: https://github.com/SynoCommunity/spkrepo/blob/ec69e1ea3819d543b15bfc786c735c3dab86c3b0/spkrepo/tests/common.py#L281-L286

Should code similar to the above be incorporated into the api.py on package upload?
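Something along the lines of the test helper should do it; a rough sketch (the actual commit may differ):

    # a sketch along the lines of the test helper; the actual commit may differ
    import hashlib

    def compute_md5(fileobj, chunk_size=8192):
        md5 = hashlib.md5()
        for chunk in iter(lambda: fileobj.read(chunk_size), b""):
            md5.update(chunk)
        fileobj.seek(0)  # rewind so the stream can still be written to disk
        return md5.hexdigest()

    # in the upload handler, before saving the build:
    # build.md5 = compute_md5(request.files["file"].stream)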

EDIT: I've included a proposed solution in https://github.com/SynoCommunity/spkrepo/pull/112/commits/688bcfd9cecd1bef886814b39ecb49db28df4269, let me know what you think.

Diaoul commented 8 months ago

Good catch! It's been like that forever 😅

mreid-tt commented 8 months ago

Good catch! It's been like that forever 😅

I've tested the upload and it writes a hash value to the database, which I've verified to be correct. I was trying to check the sign and unsign functions, but I don't think I have my environment set up correctly.

From the base config.py, I've set GNUPG_PATH = "/usr/bin/gpg" and set the SECRET_KEY to a random string generated by base64 < /dev/urandom | head -c 64, but when I try to sign any package I get a "failed to sign build" error with no other log details. Can you assist?

EDIT: So I ran all the verification tests again and I got this error:

        # issue 112: fail if the specified value isn't a directory
        if gnupghome and not os.path.isdir(gnupghome):
>           raise ValueError('gnupghome should be a directory (it isn\'t): %s' % gnupghome)
E           ValueError: gnupghome should be a directory (it isn't): /usr/bin/gpg

So then I changed it to GNUPG_PATH = "/usr/bin" and ran it again. This time I got this error:

  File "/home/mreid/Documents/spkrepo/spkrepo/utils.py", line 369, in _generate_signature
    raise SPKSignError("Cannot verify timestamp")
spkrepo.exceptions.SPKSignError: Cannot verify timestamp

So, I'm not sure where to go next. Is the default value no longer valid: GNUPG_TIMESTAMP_URL = "http://timestamp.synology.com/timestamp.php"?

Diaoul commented 8 months ago

This should be a GNUPGHOME directory, not /usr/bin 😬

Any directory would work but please create a new one.

Maybe you need to create a private/public key pair, I don't remember. I'd need to check the configuration on the server 🙇

mreid-tt commented 8 months ago

This should be a GNUPGHOME directory, not /usr/bin 😬

Thanks for that. I eventually used the default folder at /home/mreid/.gnupg and that worked fine to interact with the default keystore.

Maybe you need to create a private/public key pair, I don't remember. I'd need to check the configuration on the server 🙇

While creating a private/public key pair was a useful step, this was not the root cause of the issue. The Cannot verify timestamp error was because it could not validate the signature returned from the timestamp server. To address this I had to import its public keys like this:

$ curl https://keymaker.synology.com/synokey --output synokey.tar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10240  100 10240    0     0   8169      0  0:00:01  0:00:01 --:--:--  8172
$ tar -xvf synokey.tar
VERSION
keyinfo-sys
keyring
$ gpg --import keyring 
gpg: key 38E86A2B309E88B0: public key "Synology Inc. (Timestamp)" imported
gpg: key E92D287366A13D20: public key "Synology Inc. (Official)" imported
gpg: Total number processed: 2
gpg:               imported: 2

Once those steps were completed I was able to sign and unsign packages successfully. Through this, I fixed an error with updating the md5 hash for the sign/unsign and the commit was updated. Testing looks good so far.
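For anyone reproducing this locally, the working signing-related settings end up looking roughly like this (paths are illustrative):

    # working local signing setup per the discussion above; paths are illustrative
    GNUPG_PATH = "/home/mreid/.gnupg"  # a GNUPGHOME directory, not the gpg binary
    GNUPG_TIMESTAMP_URL = "http://timestamp.synology.com/timestamp.php"
    # plus the Synology timestamp keys imported into that keyring, as shown above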

th0ma7 commented 8 months ago

Probably a bit off topic, but out of curiosity: using this, are you able to unsign an official Synology package as well, thus allowing it to be untarred afterwards?

mreid-tt commented 8 months ago

Probably a bit off topic, but out of curiosity: using this, are you able to unsign an official Synology package as well, thus allowing it to be untarred afterwards?

Alas, no. The official packages are likely encoded or encrypted in some way to prevent simple untar operations. If I am curious about a package (to learn from its wizard files for example), the most I would do is to install it and then look at the package files installed in /var/packages/[package_name].

hgy59 commented 8 months ago

@mreid-tt what is the reason to enlarge the version field?

I guess you are about to implement the feature to avoid delivering DSM 6 packages for DSM 7 requests?

mreid-tt commented 8 months ago

@mreid-tt what is the reason to enlarge the version field?

  • hopefully not to add more parts to the version (major.minor must be sufficient). In fact only the major version is relevant; the main criterion for version compatibility is the build version.

Currently the field is 3 characters long. I expect this will cause a problem when Synology moves from version 9.9 to version 10.0.

I guess you are about to implement the feature to avoid delivering DSM 6 packages for DSM 7 requests?

Yes, I've made a basic implementation in https://github.com/SynoCommunity/spkrepo/pull/112/commits/e2b7f295a2d5bfb495343b4bc10d0a5006ae3f9c which works as follows:

  1. We add a minimum compatible DSM version field to the firmware table
    • So for example if we have firmware with version 7.0 or 7.1 we can set a minimum version as 7.0
  2. This will force the NAS function to omit builds in the catalog which do not meet this minimum

@hgy59, @publicarray, let me know your thoughts on this solution to #63.

EDIT: It's a bit clunky because if you choose to set a minimum version, you'll need to set it for every firmware with the same version number, but it works!

hgy59 commented 8 months ago

Yes, I've made a basic implementation in e2b7f29 which works as follows:

1. We add a minimum compatible DSM version field to the firmware table

   * So for example if we have firmware with version 7.0 or 7.1 we can set a minimum version as 7.0

2. This will force the NAS function to omit builds in the catalog which do not meet this minimum

This solution makes the selection unnecessarily complex.

The criterion within a major version is the build number. I propose to first filter for the major version (i.e. 5, 6, 7) and then take the highest build number that is less than or equal to the build number of the request. This does not need additional db fields.

Single caveat: for the noarch packages we might need to create packages for DSM 5 instead of DSM 3.1 for DSM < 6.

Remark: we must not provide packages for SRM (1.x) since there is no Package Center on SRM and installation must be done manually.

mreid-tt commented 8 months ago

This solution makes the selection unnecessarily complex.

The criterion within a major version is the build number. I propose to first filter for the major version (i.e. 5, 6, 7) and then take the highest build number that is less than or equal to the build number of the request. This does not need additional db fields.

Hmm, you may be right. I'll revert the changes and start again.

Single caveat: for the noarch packages we might need to create packages for DSM 5 instead of DSM 3.1 for DSM < 6.

Remark: we must not provide packages for SRM (1.x) since there is no Package Center on SRM and installation must be done manually.

I'll see if I can come up with an approach that handles both of these scenarios.

mreid-tt commented 8 months ago

@hgy59, I've implemented your proposed logic to restrict the NAS catalog to builds with the same major firmware version, except where the major firmware version is < 6 and the architecture is "noarch", in which case it will include "noarch" builds from major firmware versions 3 and 5.
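For illustration, the selection amounts to something like this (a sketch, not the exact code in nas.py):

    # illustrative only, not the exact code in nas.py
    def compatible_majors(firmware_version, arch):
        major = int(firmware_version.split(".")[0])
        if major < 6 and arch == "noarch":
            # early DSM releases shipped noarch packages under majors 3 and 5
            return {3, 5}
        return {major}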

mreid-tt commented 8 months ago

Currently the field is 3 characters long. I expect this will cause a problem when Synology moves from version 9.9 to version 10.0.

If we did ever want to do this in the future, I thought I'd document it here:

  1. Edit spkrepo/models.py and change the following: https://github.com/SynoCommunity/spkrepo/blob/6f8601a833cbb2168f2267d9a6b21d7fa7588ff7/spkrepo/models.py#L137 to
    version = db.Column(db.Unicode(5), nullable=False)
  2. Run the following command from the venv shell:
    (spkrepo-py3.11) $ alembic -c migrations/alembic.ini revision --autogenerate -m "Increase Firmware version length"
  3. Open the generated file (e.g. migrations/versions/26da514ce7fc_increa) and review/remove alembic comment lines
  4. Reformat the file with Python black

To have the local instance incorporate the latest migration file, run the following command:

(spkrepo-py3.11) $ alembic -c migrations/alembic.ini upgrade head

To roll back the last migration in the local instance, run the following command:

(spkrepo-py3.11) $ alembic -c migrations/alembic.ini downgrade -1

Once downgraded, remove the new migration file (e.g. migrations/versions/26da514ce7fc_increa).
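For reference, the autogenerated migration would contain roughly the following, assuming the table is named firmware (a sketch, not the actual file):

    """Increase Firmware version length

    Revision ID: <autogenerated>
    Revises: <previous head>
    """
    import sqlalchemy as sa
    from alembic import op

    def upgrade():
        # widen firmware.version from 3 to 5 characters
        op.alter_column(
            "firmware",
            "version",
            existing_type=sa.Unicode(length=3),
            type_=sa.Unicode(length=5),
            existing_nullable=False,
        )

    def downgrade():
        op.alter_column(
            "firmware",
            "version",
            existing_type=sa.Unicode(length=5),
            type_=sa.Unicode(length=3),
            existing_nullable=False,
        )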

mreid-tt commented 8 months ago

@Diaoul, this PR should be good for final review and approval.

th0ma7 commented 8 months ago

Btw @mreid-tt nice job on this PR. These are long overdue changes that were needed, thnx.

Note that I can't review as this isn't my expertise.

mreid-tt commented 8 months ago

The title of this commit is related to download counters, but I suppose you didn't implement download counters, did you?

Yes, download counters were already present as a concept, but parts of the implementation were broken. I just fixed the broken parts so it can work.

hgy59 commented 8 months ago

Yes, download counters were already present as a concept, but parts of the implementation were broken. I just fixed the broken parts so it can work.

So if this works now, we could close #22 with this PR.

hgy59 commented 8 months ago

@mreid-tt can you validate whether #14 is still an issue - or fixed with this PR?

hgy59 commented 8 months ago

Yes, download counters were already present as a concept, but parts of the implementation were broken. I just fixed the broken parts so it can work.

I looked into the code and, oops, this might be a huge performance issue.

Package in models.py

    download_count = db.column_property(
        db.select(db.func.count(Download.id))
        .select_from(Download.__table__.join(Build).join(Version))
        .where(Version.package_id == id)
        .scalar_subquery(),
        deferred=True,
    )
    recent_download_count = db.column_property(
        db.select(db.func.count(Download.id))
        .select_from(Download.__table__.join(Build).join(Version))
        .where(
            db.and_(
                Version.package_id == id,
                Download.date >= datetime.now() - timedelta(days=90),
            )
        )
        .correlate_except(Download)
        .scalar_subquery(),
        deferred=True,
    )

can anyone look into the download table of the production system?

I suppose there are millions of records.

And each query for packages via API needs to query and evaluate the two counters for each package.

IMHO we need to redesign this. At the least we should periodically update a download_count for each package as a new field in the package table, while removing the related entries in the download table. But for the recent_download_count I don't have a clue where to store it, since we do not have a related table.

@Diaoul, @publicarray is there any information available about the size of the whole db and the download table in the production system?

I am not a DB admin and maybe this is peanuts for Postgres, but to me it smells like bad design.
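For illustration, the roll-up I have in mind could look something like this (purely hypothetical; download_count_total would be a new column that does not exist in the current schema):

    # purely hypothetical; download_count_total is an assumed new column,
    # not part of the current schema
    from datetime import datetime, timedelta

    def roll_up_download_counts(session):
        cutoff = datetime.now() - timedelta(days=90)
        for package in session.query(Package):
            folded = (
                session.query(Download)
                .join(Build)
                .join(Version)
                .filter(Version.package_id == package.id, Download.date < cutoff)
                .count()
            )
            package.download_count_total = (package.download_count_total or 0) + folded
        # drop the folded rows so the live table only holds the last 90 days
        session.query(Download).filter(Download.date < cutoff).delete(
            synchronize_session=False
        )
        session.commit()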

mreid-tt commented 8 months ago

I looked into the code and, oops, this might be a huge performance issue.

I believe that the concern around performance was flagged in https://github.com/SynoCommunity/spkrepo/issues/17#issuecomment-400835816 previously.

can anyone look into the download table of the production system?

I suppose there are millions of records.

Since the original code was never fully implemented, the download table should have no records. The download links presented to clients in nas.py were direct links using the path, so they bypassed the counting altogether.

And each query for packages via API needs to query and evaluate the two counters for each package.

The count is actually based on the download records with two different date ranges: one counting all records, the other counting downloads in the last 90 days.

IMHO we need to redesign this. At the least we should periodically update a download_count for each package as a new field in the package table, while removing the related entries in the download table. But for the recent_download_count I don't have a clue where to store it, since we do not have a related table.

@Diaoul, @publicarray is there any information available about the size of the whole db and the download table in the production system?

You may be right that this needs a redesign. On the other hand, I don't know what our rate of client interactions with the server is. If implemented, I expect an initial slowdown as the server back-populates the missing md5 values, but after that we may be fine.

I am not a DB admin and maybe this is peanuts for Postgres, but to me it smells like bad design.

I'm no DB expert myself, but I am sure there are optimisations that can be done; download counters and per-transaction records exist everywhere on the internet, so someone should have solved this already.

hgy59 commented 8 months ago

@mreid-tt I did some testing with the local db. The DSM 7 delivery works as expected. I added the following firmware versions (and assigned packages):

  • 5.2-5644
  • 6.2-25556
  • 7.1-42661

To limit the delivery for DSM 7.0 (any firmware below 42661) I had to add firmware version 7.0-40000.

BTW: the production db might need some maintenance. The following (invalid) firmware version has to be removed:

  • 7.0-4000 (the 7.0-40000 must be kept)

AFAICR that was me who accidentally added that.

mreid-tt commented 8 months ago

@hgy59 if the last few commits to fix the counter are a sticking point, I can break them out into a new PR and we can hash it out there. This PR is already quite large for a review. What do you think?

hgy59 commented 8 months ago

You may be right that this needs a redesign. On the other hand, I don't know what our rate of client interactions with the server is. If implemented, I expect an initial slowdown as the server back-populates the missing md5 values, but after that we may be fine.

The spkrepo was originally designed without manual package downloads available, and IMO it would not be an issue if only downloads via the API are counted.

I don't know how often the NASes query for packages of the configured repositories (I guess at least every 48h), but I suppose that this is much more traffic than the manual user sessions. And I guess most users install SynoCommunity packages via Package Center.

(with 10'000 NASes configured, each querying every 48h, that is 10'000 / (48 × 60) ≈ 3.5, i.e. an average of about 4 queries per minute)

hgy59 commented 8 months ago

@hgy59 if the last few commits to fix the counter are a sticking point, I can break them out into a new PR and we can hash it out there. This PR is already quite large for a review. What do you think?

If there are no entries created in the download table before, I recommend reverting the changes and using a dedicated PR.

mreid-tt commented 8 months ago

If there are no entries created in the download table before, I recommend reverting the changes and using a dedicated PR.

I've reverted that commit and made a new branch. It will need to be based on this branch, however, since it depends on the existence of the md5 implementation. I can either post it as-is and rebase later, or publish the PR after this is merged in.

Diaoul commented 8 months ago

(with 10'000 NASes configured, we get an average of 4 queries per minute)

The figures are much more impressive than that: we have ~3.5M req/day, which is about 40 req/s. I'm not sure how many downloads, but that's crazy high as well. There is no way our server could serve that load.

We rely on Fastly as a CDN and have a 97% hit ratio, so only 3% of that load makes it through to the server. Still, this is a heavy load for it as it's not optimized at all. There might be N+1 queries hidden there (although I remember I tried to optimize for that).

The only reliable way to get download statistics would be to parse the Fastly logs into actual data and back-fill that into our repository with scheduled jobs. I never implemented it because I found it was not worth the effort; however, it's an interesting challenge!

mreid-tt commented 8 months ago

@Diaoul, that's impressive. When you can I'd appreciate a review of #111 and #112 so we can have these moved into production.

mreid-tt commented 8 months ago

@mreid-tt I did some testing with the local db. The DSM 7 delivery works as expected. I added the following firmware versions (and assigned packages)

  • 5.2-5644
  • 6.2-25556
  • 7.1-42661

To limit the delivery for DSM 7.0 (any firmware below 42661) I had to add firmware version 7.0-40000.

@hgy59, I believe the logic you laid out in https://github.com/SynoCommunity/spkrepo/pull/112#issuecomment-1883685947 may suffer from a flaw given our current database. Since we mix DSM and SRM versions in the Firmware table, Synology's now-overlapping build ranges can cause problems. For example, the latest Release Notes for SRM show build numbers reaching 9346, which exceeds DSM 6.0 in our table. In fact we currently have in the table an SRM 1.2 build 7742, which exceeds the highest DSM 6.0 build 7321. This will cause the major version derived by this code to be incorrect:

https://github.com/SynoCommunity/spkrepo/blob/73981ae9bf4f249e60f97bc1bb294c5cbcf96347/spkrepo/views/nas.py#L47-L52

Remark: we must not provide packages for SRM (1.x) since there is no Package Center on SRM and installation must be done manually.

Your comment above from the original logic proposal gave me an idea. I am thinking that we add a field named 'type' to the Firmware table: a string containing the family of the software ('dsm' or 'srm'). That way I could adjust the query to be something like:

# Find the closest matching firmware for the provided build 
closest_firmware = (
    Firmware.query
    .filter(Firmware.build <= build, Firmware.type == 'dsm')
    .order_by(Firmware.build.desc())
    .first()
)
hgy59 commented 8 months ago

Your comment above from the original logic proposal gave me an idea. I am thinking that in the Firmware table we add a field named 'type' that would be a string that would contain the family of the software ('dsm' or 'srm'). That way I could adjust the query to be something like:

good idea (we might use a boolean field to enable/disable the firmware entries for delivery by API)

mreid-tt commented 8 months ago

I've implemented the idea, let me know your thoughts.

(we might use a boolean field to enable/disable the firmware entries for delivery by API)

I do still think we need a type, which is clearer for the administrator to understand. What I could do as a creature comfort would be to tweak the admin interface to make that a dropdown selector rather than requiring it to be typed in.

EDIT: Changed my mind on that last bit. Seems overly complex to implement and maintain.

EDIT: Added some input validation with regex to help keep our database clean.
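For illustration, the validation looks roughly like this with WTForms validators (a sketch; the exact fields and patterns may differ):

    # a sketch with WTForms validators; the exact patterns in the PR may differ
    from flask_admin.contrib.sqla import ModelView
    from wtforms.validators import Regexp

    class FirmwareView(ModelView):
        form_args = {
            "version": {
                "validators": [Regexp(r"^\d+\.\d+$", message="use major.minor, e.g. 7.1")]
            },
            "type": {
                "validators": [Regexp(r"^(dsm|srm)$", message="type must be dsm or srm")]
            },
        }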

mreid-tt commented 7 months ago

@Diaoul, @publicarray, could you please take a moment to review and approve this? I'm excited to move forward and get these into production. Your feedback is greatly appreciated.

mreid-tt commented 7 months ago

@Diaoul, @publicarray, could either of you find a moment to review and approve this? There have been increasing reports on Discord of users being served incorrect packages from the catalog.

mreid-tt commented 7 months ago

@Diaoul, @publicarray, I've noticed an increase in support requests on Discord due to the backend serving the wrong package for download. Could you please prioritise the review and merging of this PR? It's urgent.

publicarray commented 6 months ago

Thanks, I hope to have some time this weekend to look into deploying this PR.

mreid-tt commented 6 months ago

Thanks, I hope to have some time this weekend to look into deploying this PR.

Really appreciate it! If you have time, it would be great if you could include a review and merge of #111, as it also includes a number of important fixes.

publicarray commented 6 months ago

Nice work!

Running flask db upgrade gives this error:

  self.pop(exc_value)
  File "/home/seb/.cache/pypoetry/virtualenvs/spkrepo-4s3rbOEh-py3.11/lib/python3.11/site-packages/flask/ctx.py", line 261, in pop
    raise AssertionError(
AssertionError: Popped wrong app context. (<flask.ctx.AppContext object at 0x7f0425c22e10> instead of <flask.ctx.AppContext object at 0x7f0428915710>)

I'll have a look at what happened here.

publicarray commented 6 months ago

At first, I got this: 'NoneType' object has no attribute 'strftime'

So I imported datetime, until I realised you would get this error when modifying the confirmed_at value, as it's not a string but a custom object from Flask-SQLAlchemy:

BaseModelView.index_view() got an unexpected keyword argument 'cls'

    column_formatters = {
        "confirmed_at": lambda v, c, m, p: m.confirmed_at + "foo"
    }
mreid-tt commented 6 months ago

@publicarray, I'm thinking the 'NoneType' object has no attribute 'strftime' error is because we have None values for confirmed_at in the table. Modifying the code to this may resolve it:

column_formatters = {
    "confirmed_at": lambda v, c, m, p: m.confirmed_at.strftime("%Y-%m-%d %H:%M:%S") if m.confirmed_at else None
}
publicarray commented 6 months ago

Thanks @mreid-tt that fixed it!

mreid-tt commented 6 months ago

@publicarray, I've re-committed the change. Let me know if there is anything else I can assist with to get this PR merged.

publicarray commented 6 months ago

@mreid-tt

I've tried uploading a freshly compiled package to spkrepo; am I missing something?

http --verify=no --ignore-stdin --auth XXXXX: POST http://127.0.0.1:5000/api/packages @packages/transmission_x64-7.1_4.0.5-26.spk 
HTTP/1.1 409 CONFLICT
Connection: close
Content-Length: 220
Content-Type: application/json
Date: Sun, 25 Feb 2024 10:18:09 GMT
Server: Werkzeug/3.0.1 Python/3.11.7
Vary: Cookie

{
    "message": "Conflicting architectures: apollolake, avoton, braswell, broadwell, broadwellnk, broadwellnkv2, broadwellntbap, bromolow, cedarview, denverton, epyc7002, geminilake, grantley, kvmx64, purley, r1000, v1000"
}
mreid-tt commented 6 months ago

I've tried uploading a freshly compiled package to spkrepo; am I missing something?

Hmm, I don't think I modified any of the conflict-testing code. Could this be an actual conflict, and there really is an x64-7.1 build already present on your server for version 4.0.5-26? Could you check which variations of Transmission already exist?

publicarray commented 6 months ago

I tried again with 7.2 and that worked. Yeah, it probably already existed, sorry.

publicarray commented 6 months ago

I think it's all good :)

publicarray commented 6 months ago

@mreid-tt It's live

I did notice this in the logfile when I log in. It doesn't seem important, but maybe there is an update for passlib?

synocommunity-app-1  | (trapped) error reading bcrypt version
synocommunity-app-1  | Traceback (most recent call last):
synocommunity-app-1  |   File "/usr/local/lib/python3.11/site-packages/passlib/handlers/bcrypt.py", line 620, in _load_backend_mixin
synocommunity-app-1  |     version = _bcrypt.__about__.__version__
synocommunity-app-1  |               ^^^^^^^^^^^^^^^^^
synocommunity-app-1  | AttributeError: module 'bcrypt' has no attribute '__about__'
mreid-tt commented 6 months ago

@mreid-tt It's live

Thanks @publicarray, this is great!

I did notice this in the logfile when I log in. It doesn't seem important, but maybe there is an update for passlib?

Looking into this, passlib 1.7.4 is the latest version and it appears to be no longer maintained. This was reported as a known issue (newer bcrypt releases dropped the __about__ attribute that passlib 1.7.4 still reads) and there is a fork that seems to resolve it. Given that it appears to be just a cosmetic logging issue, we should be able to leave it alone for now.

mreid-tt commented 6 months ago

BTW: the production db might need some maintenance. The following (invalid) firmware version has to be removed:

  • 7.0-4000 (the 7.0-40000 must be kept)

AFAICR that was me who accidentally added that.

As a follow-up, I have worked with @publicarray and we have removed the following firmware entries from the database:

| Version | Build | Type |
|---------|-------|------|
| 1.2     | 1757  | srm  |
| 7.0     | 4000  | dsm  |
| 6.2     | 22259 | dsm  |
| 7.2     | 63134 | dsm  |

These were removed as they did not align with the published versions of DSM (https://www.synology.com/en-global/releaseNote/DSM) or SRM (https://www.synology.com/en-global/releaseNote/SRM).