Open Korusuke opened 5 years ago
CC-ing also @satraul.
If you plan to update just the two fields, there's also his new port bump
tool available.
What I would like to see, if possible, would also be the "diff" of dependencies between the existing and the new version (I hope one could get that list from pypi?). Maybe it would be sufficient to output just some kind of log for now, since the information is already pretty unreliable.
So after thinking about how the interface between upt and backend will be like and what will go where following is what I have come up with:
Frontend parses upt -> Backend reads the existing file -> Backend extracts information to be updated (example for macports for now it can be version line and checksums line)....When I say extract means not the value but just the lines itself as we don't need the exact info as demonstrated above -> Backend generates update file using a update template -> upt generates patch file using both of this -> upt passes patch file, diff file, new file to backend -> Backend saves required files.
So in psuedo code something like: upt-core: (Sorry about the naming)
class Backend(object):
def update_package(self, path_to_file, pkg):
versions = self.package_versions(self.name)
if versions.latest == self.version:
return
old = self.read_old_file(path_to_file)
to_update = self.extract_update(old)
updated = self.extract_update(pkg)
file_diff = diff(old, pkg) # os diff call here
patch = diff(to_update, updated) # os diff call here
new = apply_patch(patch) # os patch call here
self.save_update(patch, file_diff, new)
Backend will have to implement :
extract_update()
read_old_file()
(this can be done in upt-core but I am not sure about it)
save_update
upt-core cannot handle this due to naming preference of different backend
and also this whole function can be overridden by backends in case they want to do something different, for example as mojca suggested for MacPorts we can use port bump
tool which is already doing the same job using regex and in an interactive manner.
What do you think about this?
I think the general outline looks fine; at least from my perspective, but I'll defer to @Steap for everything touching upt
itself and about what should be done in the core/backend.
Ideally, I would prefer to skip all the intermediate "extract_update" stuff and just do
but again, I don't know how feasible that is. I'll need to play around with some of the tools like diff-match-patch
a bit more (soon-ish I hope) and see how that works.
Hello,
On 2019-07-11 19:33, Karan Sheth wrote:
So after thinking about how the interface between upt and backend will be like and what will go where following is what I have come up with:
Frontend parses upt -> Backend reads the existing file -> Backend extracts information to be updated (example for macports for now it can be version line and checksums line)....When I say extract means not the value but just the lines itself as we don't need the exact info as demonstrated above -> Backend generates update file using a update template ->
So basically these are the small files that only contain (for now) "versions" and "checksums", right?
upt generates patch file using both of this -> upt passes patch file, diff file, new file to backend -> Backend saves required files.
This honestly seems overly complicated. We need to extract data from an existing Portfile, do some work "on the side", re-inject everything into the backend and have it handle a merge.
It seems difficult to implement (for MacPorts, let alone other backends) and error-prone.
Or maybe I'm misunderstanding something?
If you plan to update just the two fields, there's also his new
port bump
tool available.
What can port bump do, exactly?
What I would like to see, if possible, would also be the "diff" of dependencies between the existing and the new version (I hope one could get that list from pypi?). Maybe it would be sufficient to output just some kind of log for now, since the information is already pretty unreliable.
So, my initial idea for update was to parse CPAN/PyPI/Rubygems/etc. for 2 different versions and generate a high-level diff ("these are the new requirements", "these requirements are no longer needed", etc.). At first, it seemed doable for PyPI:
https://pypi.org/pypi/requests/0.10.0/json <- an old version https://pypi.org/pypi/requests/json <- the current version
It seemed doable for CPAN as well:
https://fastapi.metacpan.org/v1/release/NEILB/AppConfig-1.70 <- an old version https://fastapi.metacpan.org/v1/release/AppConfig <- the current version
But I do not think Rubygems (and other frontends to be added in the future) will provide info about all versions of a package.
Also, it should be noted that some info is missing from these REST API. For instance, the test requirements for "requests" do not show up on PyPI (for various reasons that I won't go into here).
Wouldn't the best way to update just be to create a "Portfile.new" file, and then have the user run diff/git diff/whatever?
port bump
basically does just the following (for any given port):
I wrote this and forgot to hit send before I read your replies:
I just want to reiterate: updating just port version and checksums is something that already port bump
has done (testing and feedback greatly appreciated), and is relatively easy to do in general, but of course you miss a whole bunch of stuff.
What I would really like to see is at least some rudimentary support to tell you which dependencies changed between the two versions. I don't mind as much if I need to add those manually for a moment, but figuring out the exact dependencies is really the most time-consuming step which I very strictly :) always ignore when I need to do some mass update. We should be able to extract both old and new dependencies from the two versions, compare them, and at least tell the user:
But I do not think Rubygems (and other frontends to be added in the future) will provide info about all versions of a package.
The v2 of rubygems api seems to support this: https://rubygems.org/api/v2/rubygems/acts_as_list/versions/0.9.18.json https://rubygems.org/api/v2/rubygems/acts_as_list/versions/0.9.19.json
But yeah you are right maybe all frontends doesn't support this.
I agree that the whole process is a bit complicated so maybe we can divide that into two parts and do the following:
Now optionally backends can implement apply_update feature which is backend specific:
port bump
for now.Is this fine?
I agree that Cyril initial idea to "generate a high-level diff" would be a good one, but indeed probably not doable for all frontends. Generating a new Portfile can and should be done for sure, in addition to making a diff between the two files; no problem there.
I also like Karan's idea of implementing an optional "apply_update" function (or whatever name) in the backend to do something more specific. I am not a fan of doing port bump
for the MacPorts backend, that might for now work do update the version and checksums but is not scalable to anything more complicated for dependencies and such. So I think we need to come up with a way to "swap" the information into the new Portfile and should try to get that to work for simple things like "version/checksums" first and once that works it should be possible to extent.
Again, I don't like making separate intermediate diff files to patch the Portfile. This should be something people have done before and I would suggest you to investigate things like diff-match-patch
or other approaches already being used.
@Steap can we first agree on how/what should be implemented in upt
and what should be done in the backend and take it from there?
So, here is an updated (pun indented) idea of how to do things:
1) We get the current version of PKG using the recently introduced backend.package_versions() method. The current version is probably the "highest" version of the returned list.
2) We ask the frontend to parse both the "old version" and the "latest version".
3) We create a "package diff":
class PackageDiff(object):
def __init__(self, old_pkg:upt.Package, new_pkg:upt.Package):
self.version = new_pkg.version
def new_requirements(self):
pass
def deleted_requirements(self):
pass
upt.PackageDiff.new_requirements would be similar to upt.Package.requirements, but would only return the new requirements.
TODO: figure out what else we might want to access from a PackageDiff.
4) upt.Backend has a "create_package" method, it should have an update package as well.
class MacPortsBackend(upt.Backend):
def update_package(self, pkgdiff:upt.PackageDiff):
...
Now, it's up to the backend to do the right thing. Can the backend easily make the required modifications? If not, maybe the abstraction I've proposed is not really good.
Sounds great, I have a few doubts(not even sure they are doubts)...so I will first code this up and if anything goes wrong then I will ask.
Ok so all of what @Steap suggested is done more or less...now before coming to upt-macports. Till now I haven't used file_path input which we had discussed to take as input for --update.
For upt-macports backend, this are the things I think we will do:
Anything else that you think we should be doing?
@Steap I fully agree with your latest proposal! I don't understand though how this differs from you first "high-level diff" that seemed not supported by all the frontends. Anyway, if you think this will work I think it's a great way of doing this, so let's go for it!
@Korusuke as I said earlier I would prefer not to do port bump
on the existing Portfile. This will only work for version and checksums and not for anything else. This is a major, and very important, part of your GSoC project for MacPorts and if it takes time to find a way to do this, this is time well spent.
Of course, we could decide to finalize the pieces needed for upt
and start with what you suggested above as an intermediate solution. Sometimes the drawback of such an approach is though that the ideal solution might never get implemented...
Ok my bad, I somehow tested diff match patch wrongly or missed out..... but testing again with all the output that current idea is giving, it works unbelievably amazingly well :) Will push the code tomorrow.
@reneeotten Yes, this is the same "high-level diff" approach. If some frontends do not give us access to all versions of a package, well, too bad. We could always ask the backend to parse their own files if needed, by the way.
Maybe we could improve "port bump" instead of implementing everything in upt-macports? I'm not sure exactly how "port bump" works, though.
Maybe we could improve "port bump" instead of implementing everything in upt-macports? I'm not sure exactly how "port bump" works, though.
Perhaps, but it's written in Tcl, which is pretty much voodoo to me ;) Looking at the man-page it will only update the checksums and reset the revision, after updating the version number of a port by the maintainer.
If we first need to update the version number and then run port bump
I see no use-case for this when using upt
, it appears to me that we have all the required information ourselves already.
If we first need to update the version number and then run
port bump
I see no use-case for this when usingupt
, it appears to me that we have all the required information ourselves already.
Hi all. I haven't been following the discussion closely, but it is already planned for me to implement a --livecheck
argument for port bump
so the version number is updated automatically to the latest version before bumping. Issue here: https://github.com/satraul/macports-base/issues/4
Is this the use case needed? I can prioritize working on this feature if so.
Maybe we could improve "port bump" instead of implementing everything in upt-macports? I'm not sure exactly how "port bump" works, though.
Perhaps, but it's written in Tcl, which is pretty much voodoo to me ;)
We have the GSOC student who implement this as a challenge task during proposal submissions (@satraul) and could improve the tool if needed.
Looking at the man-page it will only update the checksums and reset the revision, after updating the version number of a port by the maintainer.
True.
If we first need to update the version number and then run
port bump
I see no use-case for this when usingupt
, it appears to me that we have all the required information ourselves already.
True.
What port bump
does is in fact merely replacing checksums, resetting revision (I forgot about that part) and updating version (I thought this was already implemented, but according to @satraul's response maybe not yet, even though it would be doable).
Maybe my message was misunderstood before / I was not clear.
What port bump
does is trivial to replicate with upt
at this stage, while at the same time port bump
has absolutely no chance of knowing what dependencies changed. And I consider the latter (at least notifying the user what dependencies need to be added or removed) as one of the most important goals of the upt
project. So all in all I see no reason for upt
to depend on port bump
, it's easier and far more flexible to implement everything that needs to be done in python.
port bump
remains a very useful tool for other ports with no external meta information available. For python/ruby/perl we should aim higher though.
@satraul: yes, updating version would be nice, or support something like port bump maintainer:mojca
:), but there is no need to re-prioritise your tasks. Just concentrate on your main task for now and you may do this when you need a short break from fighting with xcode, or once your major milestone is achieved.
What would also be cool as the "last step" for both tools (upt
and port bump
) would be the ability to automatically open pull requests, but that requires a bit more work, and in any case if that ever gets implemented, it should be shared for both tools.
port bump
remains a very useful tool for other ports with no external meta information available. For python/ruby/perl we should aim higher though.
Agreed that it's useful for other ports, but as I've been trying to say for a while now it should not be used here. We have the same information (and much more) easily available from upt
and should implement a proper solution. But again, it sounds like @Korusuke has possibly found such a solution already using diff-match-patch
so let's see how that goes!
We can then figure out which part of the code should go (as default option for example) into upt
and what is backend-specific.
I would like to see, at least in the testing phase, the following:
upt
as if it was a new package (--> Portfile.upt)Version for perl
ports in macports doesn't match that of upstream, if the upstream version is 1.41
then version in macports is 1.410.0
.
This is consistent for atleast the random 10 ports that I checked, so I guess we should just handle it in upt-macports, or is it something that will be fixed in macports itself ?
Edit: Nope I was wrong, it's weird as in Portfile the version is right but port info --version
returns wrong version and that too the returned version is inconsistent(what I have described above is wrong for some ports).
App-rad: version is 1.05 port info gives 1.50.0
So I guess we should fix port info first as this is important for both update as well as recursive feature to work.
What do you think ?
Edited a bunch of stuff in previous comment and as the github mail won't contain that so ping :) @reneeotten @mojca @Steap
Version for
perl
ports in macports doesn't match that of upstream, if the upstream version is1.41
then version in macports is1.410.0
.
Version numbers in perl are... interesting. The observation above is only true for upstream modules that use a decimal number for their version. Our version comparison doesn't support such numbers, e.g. 1.10 is considered bigger than 1.2. The perl5.setup proc transforms the version in a way that should match what is done by perl -Mversion -e 'print version->parse("SOMEVERSION")->normal'
.
Further reading: https://metacpan.org/pod/version
thanks for the explanation @jmroot
In that case we should do the same transformation of the upstream version in upt_macports.py
before doing checking whether the correct version is already packaged in MacPorts or if there is a newer version available upstream. @Korusuke let's transform the proc perl5_convert_version
in "_resources/port1.0/group/perl5-1.0.tcl" into a Python function.
thanks for the explanation @jmroot
In that case we should do the same transformation of the upstream version in
upt_macports.py
before doing checking whether the correct version is already packaged in MacPorts or if there is a newer version available upstream. @Korusuke let's transform theproc perl5_convert_version
in "_resources/port1.0/group/perl5-1.0.tcl" into a Python function.
The check about version checking actually takes place in upt-core and not in upt-macports which is only responsible of returning the current version. So we actually need to apply inverse transformation and return that to upt-core.
For doing this although I was highly skeptic about it working, I replicated the existing proc in python and then tried to invert it which "kind of wokrs" but not always. Consider the following cases: -Upstream version: 1.23 -Parsed version: 1.230.0 -Inverse version(my script): 1.23
-Upstream version: 1.2.3 -Parsed version: 1.2.3 -Inverse version(my script): 1.002003
So basically if the original upstream version has two decimal points then we don't have to apply any transformation.
I am kinda stuck and don't think there's any way to correct this script. But here are potential solution's:
Somehow get the upstream version from macports itself as the version is present in the portfile afterall.
Overwrite needs_requirement()
, this way we can apply the normal transformation and that way it would work perfectly, here we can also change the log statements to better suit macports as @reneeotten once mentioned.
@Korusuke I looked a bit more today and I tend to agree with you that reverse engineering the upstream package version is difficult to do correctly. For example, all these "Portfile" versions [1.1, 1.10, 1.100] would give the same "Perl converted" version (v1.100.0) so it's pretty much impossible to get back to the upstream version from port info --version <perl-port>
.
Ideally, I prefer not overwrite the needs_requirement()
function that is present in upt
, unless really necessary. Additionally, I suspect that other backends could have similar issues when comparing Perl versions (especially if a package switches between versions with/without the v
).
It seems to me that the recommended way of comparing Perl package versions is to first convert them as described in the link that @jmroot gave above. If that's the case, what about putting that logic in upt
? In other words, why don't we convert both the upstream version as well as the version(s) returned by the backend and then do the comparison? That would in my opinion solve the issue, correct? The slightly annoying thing is then that there would be some cpan
specific code in there, which might not be ideal... @Steap what do you think?
I am horrified. I really wonder how Perl ended up with something this complicated just to specify a version.
I do not really understand why needs_requirement would have anything to do with this. Can you tell me more about what you had in mind?
I do not really understand why needs_requirement would have anything to do with this. Can you tell me more about what you had in mind?
Sure, there might be other/better ways and it's possible that I'm far off here, but.... my thought process this afternoon and how needs_requirement
would fit in here is below:
For example, upstream has its version set to “1.04” so this is what upt
will get and also what will be in the MacPorts portfile. According to the recommended Perl conversion (and also what port info --version gives) this version number will become 1.40.0. (We skip the v
in MacPorts versions, but that's fine because if you run the conversion again it gives the same result with the v
).
Now, let’s assume this package in MacPorts becomes outdated and version “1.05” is released and we attempt to package another package that requires this version. Currently, in upt_macports.py
in package_version()
we return the version that is present in MacPorts (using port info --version <portname>
). So we return that we have version “1.40.0” packaged and only version 1.0.5 is required by the thing we are attempting to package. So upt
will think everything is good in the packaging-world when executing needs_requirement()
, whereas in reality the version we have is not sufficient (this will happen now already with the "recursive feature", but certainly also when deciding a package needs to be updated). It's possible that other backends specify their versions exactly as upstream and as an alternative solution we could just extract the upstream version from the Portfile and return that instead.
However, it seems to me that similar issues can appear as well if requirements and upstream versions are expressed differently. From the link mentioned above: “If you need to compare version numbers, but can't be sure whether they are expressed as numbers, strings, v-strings or version objects, then you should use version.pm to parse them all into objects for comparison.”
So what I am proposing is to add a Python function to upt
that does the recommended conversion (i.e., gives the same output as “perl -Mversion -e 'print version->parse(package_version()
and the requirement specifier in needs_requirement()
in upt
.
In other words, in pseudo-code something along those lines:
def standardize_perl_version(version):
“”” return the same output as “perl -Mversion -e 'print version>parse(version)->normal’) “””
std_perl_ver = <some magic code to do the conversion>
return std_perl_ver
in the needs_requirement()
function:
if frontend == ‘cpan’:
versions = standardize_perl_version(self.package_versions(req.name))
else:
versions = self.package_versions(req.name)
and a little further down:
if frontend == ‘cpan’:
s = SpecifierSet(standardize_perl_version(req.specifier))
else:
s = SpecifierSet(req.specifier)
compatible_versions = list(s.filter(versions))
Apologies for the very long post.... but it seems to me this solves the issue for MacPorts and at the same time will work in a backwards-compatible way for all backends. As an additional bonus it should help for all backends when version are expressed differently (as described above). Let me know what you think and I am certainly open for other, simpler solutions ;)
So this is also an issue for recursive packaging, right?
Rather than having a bunch of if/else in needs_requirement, one solution would be to "fix" upt-cpan, and always use "normalized" versions. I'm not sure exactly what other backends expect, so I'm a bit reluctant to make this change right now.
A workaround for macports would be to override need_requirement in this way:
def need_requirement(self, req, phase):
new_req = ... # modify the requirement to have a "standardized" version" if needed
super().need_requirement(new_req, phase)
This way you only implement the "magic" conversion (which may or may not be moved to upt-cpan later) and still take advantage of the regular need_requirement version.
What do you think?
@Steap long-term it might be indeed worthwhile to move to the normalized/standardized version in upt-cpan
but for now that will likely have a too big impact on other backends.
Your suggestions is a very good one, I should have thought about using super()
... So we extract the value of the version from req.specifier
, do the MacPorts/Perl-normalization on it and put it back (keeping the "<", ">", "==", .... elements intact). For frontends other than cpan
, we don't need to do anything. Yeah, that should work I think (except that you're missing the s
in the function name --> ``needs_requirement```).
@Korusuke does the discussion above and the suggested solution makes sense to you - if so, can you give this a try. I think we should implement this in a separate PR; if we can resolve this for the already merged "recursive packaging feature", it should also work for the "update feature".
So after trying a bunch of things like diff-match-patch, normal diff and vaguely trying regex following is the technique that works best at least for situations that I tried.
Let's say we have a Portfile which we need to update and we have decided that we will update
version
andchecksums
field. So first we extract that fields from existing Portfile (yet to be automated). and make a file which looks like this:and then we generate a new file which looks like this:
Notice the blank line at end of file, that is apparently important for this to work.
Now we generate a diff for above two files and then apply patch on the original file. This even updates a file that maybe modified and has different order of fields than what upt usually has.
For example from this:
to this: