spdx / LicenseListPublisher

Tool that generates license data found in the license-list-data repository from the license-list-XML source
Apache License 2.0

Timestamp updates for unchanged data complicate usage, especially caching #136

Open aiuto opened 2 years ago

aiuto commented 2 years ago

The publisher updates the timestamps of data files even if no other information has changed. See the commit history of licenses.json for an example: https://github.com/spdx/license-list-data/commits/6552b858c0f9224be9ed38ad8e941f662a4496ab/json/licenses.json

This creates needless churn in downstream systems that cache this information. It would be more useful to have a modification timestamp.

cc: @danielmachlab

goneall commented 2 years ago

@aiuto - the licenses.json file is regenerated by the application every time the application is run. This may be causing the creation timestamp issue. I don't see an easy way to change this to be a modified timestamp. If you find a way to modify the utility to resolve this problem, please create a pull request or comment on this issue.
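One way to get modification-timestamp behavior from a file that is regenerated on every run is to compare the new output with the existing file while ignoring the timestamp, and to leave the file untouched when nothing else differs. The sketch below is in Python for brevity and is not the publisher's code. The field name `releaseDate` and the helper names `without_timestamp` and `publish` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Assumed name of the timestamp field in the generated JSON.
TIMESTAMP_KEY = "releaseDate"


def without_timestamp(doc: dict) -> dict:
    """Return a copy of the document with the timestamp field removed."""
    return {k: v for k, v in doc.items() if k != TIMESTAMP_KEY}


def publish(path: Path, new_doc: dict, now: Optional[str] = None) -> bool:
    """Write new_doc to path only if its non-timestamp content changed.

    Returns True if the file was written, False if it was left untouched.
    """
    if path.exists():
        old_doc = json.loads(path.read_text(encoding="utf-8"))
        if without_timestamp(old_doc) == without_timestamp(new_doc):
            # Only the timestamp would differ, so keep the old file as-is.
            return False
    stamped = dict(new_doc)
    stamped[TIMESTAMP_KEY] = now or datetime.now(timezone.utc).isoformat()
    path.write_text(json.dumps(stamped, indent=2) + "\n", encoding="utf-8")
    return True
```

With this shape, an unchanged run produces no diff at all, so downstream caches keyed on the file or its timestamp stay valid.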

goneall commented 2 years ago

@aiuto - Any concern with closing this issue? It doesn't seem there is an easy solution or volunteer to code up the changes.

aiuto commented 2 years ago

The fix would be to

I would volunteer to try that, but have no time until December.

goneall commented 2 years ago

@aiuto Another possibility is to look at the timestamps or Git commit information on the input license-list-XML files to determine whether they changed, and to skip processing if they did not. That may be simpler than maintaining the list in memory.

The Makefile (https://github.com/spdx/license-list-XML/blob/main/Makefile) in the license-list-XML repo, which runs the CI that invokes this app, deletes the input files before running, so that would also need to change.

If you're willing to help out, I'll leave this issue open - ping me before you start so I can coordinate any other changes.
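The file-timestamp variant of this idea could look roughly like the make-style check below. It is a sketch in Python, not the publisher's code, and the function names, the `*.xml` pattern, and the directory layout are all assumptions. It only works if the input files keep their modification times between runs, which is why the Makefile behavior mentioned above matters.

```python
from pathlib import Path


def newest_mtime(directory: Path, pattern: str = "*.xml") -> float:
    """Return the most recent modification time among matching input files."""
    return max(
        (p.stat().st_mtime for p in directory.rglob(pattern) if p.is_file()),
        default=0.0,
    )


def needs_regeneration(input_dir: Path, output_file: Path) -> bool:
    """True if the output is missing or any input is newer than the output."""
    if not output_file.exists():
        return True
    return newest_mtime(input_dir) > output_file.stat().st_mtime
```

A caller would run the generator only when `needs_regeneration(...)` returns True and leave the previously published files alone otherwise.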

aiuto commented 2 years ago

Using Git commits ties the truth to the SCM implementation.
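A change check that does not depend on the SCM, or on file modification times, could hash the input contents directly and compare against a digest recorded on the previous run. This is a minimal sketch in Python under stated assumptions: the names `input_digest` and `inputs_changed`, the `*.xml` pattern, and the separate digest file are hypothetical and not part of the publisher.

```python
import hashlib
from pathlib import Path


def input_digest(directory: Path, pattern: str = "*.xml") -> str:
    """Hash the relative paths and contents of matching files in sorted order."""
    h = hashlib.sha256()
    for path in sorted(p for p in directory.rglob(pattern) if p.is_file()):
        h.update(path.relative_to(directory).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def inputs_changed(directory: Path, digest_file: Path) -> bool:
    """Compare the current digest with the stored one and record the new value."""
    current = input_digest(directory)
    previous = None
    if digest_file.exists():
        previous = digest_file.read_text(encoding="utf-8").strip()
    if current == previous:
        return False
    digest_file.write_text(current + "\n", encoding="utf-8")
    return True
```

Because the digest depends only on file names and bytes, it gives the same answer after a fresh checkout or after the inputs have been deleted and restored.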


goneall commented 1 year ago

@aiuto Just pinging you to see if you are interested in creating a fix or if I should close this issue.

aiuto commented 1 year ago

I don't have time for a fix this quarter. That doesn't mean it is not an issue, however.

goneall commented 1 year ago

> I don't have time for a fix this quarter. That doesn't mean it is not an issue, however.

I'll leave it open, but mark it as "won't fix" for now. Once you have time to work on it, update the issue or ping me and I'll change it back.