Open TurtleWilly opened 1 year ago
You can do the same thing in git with $Format:%cs$, where %cs is the formatter code to embed a YYYY-MM-DD style timestamp of the commit date (not the checkout date).
There are no tags, so git describe can't be used with any degree of accuracy.
@smarnach is this possible?
Git doesn't ship with an $id$ equivalent feature. Instead, you are encouraged to leverage SHAs generated by Git itself.
In order to embed external information, like the SHA or any other ID, we would need to pre-process the file before it is committed. This is generally the responsibility of a CI/pipeline that we don't have.
I am not inclined to add such complexity in the file itself when this is within the repo, as it would be redundant since we can leverage git.
Ideally, the tagging should happen in the pipeline that processes the list for distribution at https://publicsuffix.org/list/public_suffix_list.dat
Although these days I even question whether we still need such a distribution mechanism, and whether we shouldn't instead just rely on Git hosting.
For consumers that need/want version tagging, the current solution would be to switch to pulling the list directly from the repo. I've actually been doing it for years in the library I maintain. Here's an example:
https://github.com/weppos/publicsuffix-go/commit/a20f9abcc222b049ef9b7a28845bac88e0155ae3
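For anyone who wants to try pulling by commit, here is a minimal sketch of building a pinned download URL. It assumes GitHub's usual raw.githubusercontent.com layout for the publicsuffix/list repo; the helper name `pinned_list_url` is hypothetical and is not taken from publicsuffix-go.

```python
import re

# Assumed URL layout: GitHub's raw-file host, addressed by commit SHA.
RAW_URL = "https://raw.githubusercontent.com/publicsuffix/list/{sha}/public_suffix_list.dat"


def pinned_list_url(commit_sha: str) -> str:
    """Return the raw-file URL of the list as of one specific commit."""
    # Require a full SHA so the result never silently follows a moving branch.
    if not re.fullmatch(r"[0-9a-f]{40}", commit_sha):
        raise ValueError("expected a full 40-character lowercase hex commit SHA")
    return RAW_URL.format(sha=commit_sha)
```

Recording the SHA next to the downloaded file then gives the consumer an exact version, which can be mapped back to a commit date with git itself.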
I believe the .dat file instructs that it only be pulled from the publicsuffix.org URL in order to utilize CDN/cloud services.
Taking note here of the value of this suggestion, I wonder if we couldn't add automation that adds a date to the file itself in plaintext within the initial header comment section when merging.
I believe this would be valuable towards Universal Acceptance.
As an example, the date would make it abundantly clear to someone how stale their list is if they incorporate it statically.
Take https://github.com/publicsuffix/list/issues/1807 as an example: WhatsApp would know more clearly that they have an 8 year old copy of the PSL in use, from 2015.
Cloud Storage returns the date the list was last modified in the Last-Modified header, so anyone is free to post-process the file when downloading it via the CDN. It would also be easy to modify the deployment workflow to include the date in the file when uploading the data. From an operational point of view, I don't have any concerns about doing this, so it's up to you to make the call here, @weppos and @dnsguru. I'm happy to make the required changes if you want me to.
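As a rough illustration of the post-processing idea, the sketch below turns a Last-Modified header value into a date comment at the top of the downloaded text. The function name is hypothetical, the "// Date updated:" line format is borrowed from the proposal elsewhere in this thread, and fetching the file is left out so the example stays self-contained.

```python
from email.utils import parsedate_to_datetime


def stamp_with_last_modified(body: str, last_modified: str) -> str:
    """Prepend a '// Date updated: YYYY-MM-DD' comment derived from the header."""
    # Last-Modified uses the HTTP date format, e.g. "Tue, 01 Aug 2023 08:44:00 GMT".
    when = parsedate_to_datetime(last_modified)
    return f"// Date updated: {when.date().isoformat()}\n{body}"
```

A consumer would pass in the response body and the header value from whatever HTTP client they already use; the same transformation could equally run in the deployment workflow before upload.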
> Git doesn't ship with an $id$ equivalent feature. Instead, you are encouraged to leverage SHAs generated by Git itself.
I specifically pointed out that it does indeed do precisely this. It's part of the git-archive(1) machinery, for example the thing that GitHub uses to generate https://github.com/publicsuffix/list/archive/refs/heads/master.tar.gz
It doesn't affect git clones, although you could invoke that machinery pretty easily:
git archive HEAD <filename> | bsdtar -x -C path/to/output/directory -f -
Because the gTLD list from ICANN's JSON has a timestamp in it, and that's the most often updated element, I'd assert that "Solution Exists" if one were to track that as the last date. It does not account for deltas that occur between auto-pulls from ICANN, but due to the frequency of those, and their priority of processing ahead of subdomain projects, this works itself out relatively well.
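A minimal sketch of tracking that timestamp follows. It assumes the list carries a comment line that mentions gtlds.json and ends with an ISO 8601 UTC timestamp; that line format and the function name are assumptions for illustration, so check them against the actual file before relying on this.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed shape: a "//" comment mentioning gtlds.json, then a timestamp such as
# 2023-10-02T15:13:17Z somewhere later on the same line.
GTLD_STAMP = re.compile(
    r"^//.*gtlds\.json.*?(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z",
    re.MULTILINE,
)


def gtld_import_time(psl_text: str) -> Optional[datetime]:
    """Return the ICANN import timestamp found in the list text, or None."""
    match = GTLD_STAMP.search(psl_text)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)
```

As noted above, this only reflects the last ICANN pull, so changes merged between pulls would not move the date.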
In reviewing #1855 / #1856: in order to avoid confusion about versions in security reports, which would cause a further drain on disposable volunteer resources spent hunting, we may want to tie doing these things together.
I have seen salient arguments for doing both and also for doing neither, but a datestamp seems like a prerequisite should we go ahead and implement a security policy.
Would you be interested in an implementation of the git-archive side of this on the theory that it causes no harm to have this literal text in the file:
// this is not guaranteed to be updated, but will contain either "$Format" or else a YYYY-MM-DD timestamp
// Date updated: $Format:%cs$
and under some conditions, at least, it would be a benefit since it would actually contain:
// this is not guaranteed to be updated, but will contain either "$Format" or else a YYYY-MM-DD timestamp
// Date updated: 2023-10-02
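On the consumer side, reading such a line could look like the sketch below. It assumes the exact "// Date updated: " prefix proposed above and treats an unexpanded $Format placeholder the same as a missing line; the function name is hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Matches the proposed header line; the value is whatever follows the prefix.
DATE_LINE = re.compile(r"^// Date updated: (\S+)\s*$", re.MULTILINE)


def embedded_date(psl_text: str) -> Optional[date]:
    """Return the embedded YYYY-MM-DD date, or None if absent or unexpanded."""
    match = DATE_LINE.search(psl_text)
    if match is None:
        return None
    try:
        return date.fromisoformat(match.group(1))
    except ValueError:
        # The placeholder was left as "$Format:%cs$", e.g. in a plain clone.
        return None
```

Copies for which this returns a date can be compared directly; copies that return None would need another signal, such as file modification time.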
It would be nice to have some sort of (automatic) versioning information directly inside the "public_suffix_list.dat" file. Currently it is practically impossible to determine which file is the most current from a set of multiple "public_suffix_list.dat" on disk. This probably also could be useful for libpsl to determine what the "latest" is.
With CVS or SVN we could add
// $Id$
as the first line of the file and the problem would solve itself (svn may need a propset depending on the configuration). The source control system would then automatically insert the current version and/or date during checkout. (I'm not too familiar with git or whether it has a similar feature.)