sonatype-nexus-community / nexus-repository-r

R, v data science, much functional programming, doge
Eclipse Public License 1.0
31 stars 17 forks source link

NEXUS-20698: Add *.rds file support for R proxy #79

Closed maksyche closed 5 years ago

maksyche commented 5 years ago

https://issues.sonatype.org/browse/NEXUS-20698

Added support for .rds metadata files in proxy and group repos as requested in this issue.

Now installing old versions of packages from archive works as it should

maksyche commented 5 years ago

Question: Why have we changed from PACKAGES to METADATA? Do we need to consider existing databases that will already be marked incorrectly asPACKAGES`?

Before this pull request R format supported only PACKAGES.gz metadata. But there are plenty of other metadata such as archive.rds or current.rds which support I added here, so I renamed asset kind to fit them as well. In existing database there is only one metadata type so assets are marked correctly if it's just a mark and it doesn't affect any other logic that I'm not aware of.

doddi commented 5 years ago

Question: Why have we changed from PACKAGES to METADATA? Do we need to consider existing databases that will already be marked incorrectly asPACKAGES`?

Before this pull request R format supported only PACKAGES.gz metadata. But there are plenty of other metadata such as archive.rds or current.rds which support I added here, so I renamed asset kind to fit them as well. In existing database there is only one metadata type so assets are marked correctly if it's just a mark and it doesn't affect any other logic that I'm not aware of.

If an existing nxrm already has R community of a provious version with populated assets then any comparisons in code will be looking for the new METADATA but existing assets would be stored as PACKAGES. I am thinking this is not an issue because we are not supporting any previous versions of R. However, it seems like a breaking changing when it doesnt have to be.

maksyche commented 5 years ago

Question: Why have we changed from PACKAGES to METADATA? Do we need to consider existing databases that will already be marked incorrectly asPACKAGES`?

Before this pull request R format supported only PACKAGES.gz metadata. But there are plenty of other metadata such as archive.rds or current.rds which support I added here, so I renamed asset kind to fit them as well. In existing database there is only one metadata type so assets are marked correctly if it's just a mark and it doesn't affect any other logic that I'm not aware of.

If an existing nxrm already has R community of a provious version with populated assets then any comparisons in code will be looking for the new METADATA but existing assets would be stored as PACKAGES. I am thinking this is not an issue because we are not supporting any previous versions of R. However, it seems like a breaking changing when it doesnt have to be.

In this request I'm adding support of different metadata. It seems to be a kind of breaking changing to me. Leaving asset kind name PACKAGES and add another one for all other metadata is one of the ways. But PACKAGES.rds, which is also a package file but with another extension and compression, will be marked as METADATA (cause I have generic routing for .rds files) and I think it will be confusing. Creating routings for all known packages seems to be overkill to me. Let me know if you have better option.

maksyche commented 5 years ago

@doddi I split .rds metadata and PACKAGES into different asset kinds and attached them to different routings. Please, review changes here: 338fa1831974c5a79d2d7e4a69d5a5f0c45e894b

maksyche commented 5 years ago

Maybe not for this PR but I noticed that hosted basically allows anything to be uploaded. This seems dangerous

Yep, I'll discuss it with the team and create ticket if it doesn't exist.

maksyche commented 5 years ago

lgtm - I would like the upload * addressing before release

Created ticket for it