hlxsites / prisma-cloud-docs-website

blocks and gdoc authored content for https://docs.prismacloud.io
Apache License 2.0

Last updated time doesn't align #65

Closed: iansk closed this issue 1 year ago

iansk commented 1 year ago

The last updated time displayed on the site doesn't align with the actual time that the underlying file was updated.

  1. Go to this page.

  2. The page says the file was last updated on June 12, 2023.

     (Screenshot: last-updated-site)

  3. Click on "Edit on GitHub" to view the source file.

  4. In GitHub, click "History". Notice that the file was updated on May 4, 2023.

     (Screenshot: last-updated-github)

Results

The currently reported time for the last update isn't accurate. The site reports Jun 12, when the actual last update was May 4.

Expected results

Last updated time aligns with what's reported in GitHub.

maxakuru commented 1 year ago

@iansk that is the last published date for the document, as reported by Franklin via the last-modified header

maxakuru commented 1 year ago

if you take a look at the data from github raw, you'll see the data available at runtime: https://raw.githubusercontent.com/hlxsites/prisma-cloud-docs/main/docs/examples/test.adoc

there's no "last commit" date or sha in that data, and no last-modified date either. if there were, we could insert it into the markup, but the date currently shown is based on the "last publish" date from Franklin. Worth noting that this date will eventually be accurate, but not until all documents are published in the expected format and only republished on changes.

@iansk wdyt? is that ok, or is the last commit date a hard requirement? it would be possible to add, but it would require a lookup table in the worker mapping every file to its commit date, which would significantly increase the worker bundle size and moderately increase the build complexity
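
For illustration only, here is a rough sketch of how such a file-to-commit-date lookup table could be generated at build time. Nothing like this exists in the repo as far as this thread shows: the function name `buildCommitDateTable` and the `commit:` prefix in the log format are assumptions made for the example. It expects the text produced by `git log --format=commit:%cI --name-only`.

```javascript
// Hypothetical helper: build a Map of file path -> ISO commit date from the
// output of:  git log --format=commit:%cI --name-only
// The "commit:" prefix is an assumption added here so that date lines are
// easy to tell apart from file paths.
function buildCommitDateTable(gitLogOutput) {
  const table = new Map();
  let currentDate = null;
  for (const rawLine of gitLogOutput.split('\n')) {
    const line = rawLine.trim();
    if (line === '') continue;
    if (line.startsWith('commit:')) {
      // Start of a new commit: remember its committer date.
      currentDate = line.slice('commit:'.length);
    } else if (currentDate !== null && !table.has(line)) {
      // git log lists newest commits first, so the first date seen
      // for a path is that path's most recent commit date.
      table.set(line, currentDate);
    }
  }
  return table;
}
```

A table like this has one entry per file in the repo, which is where the bundle-size concern comes from.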

iansk commented 1 year ago

@maxakuru Does that mean last-modified for a file is derived from the time we merge a branch into main?

For example, if a branch only changes one file test1.adoc, then last-modified for test1.adoc would be the time I merge it into main, right? And then, all other files in the repo would retain their own (different) last-modified times from the last time they were merged into main, right?

maxakuru commented 1 year ago

sort of. it means the date is completely unrelated to the merge/push/commit time in git and is determined only by Franklin's publish timestamp. we have a CI job that does that publish on merge to main, but that's only one way it can be triggered

you can see the last-modified timestamp using the admin status API, for example: https://admin.hlx.page/status/hlxsites/prisma-cloud-docs/main/prisma/prisma-cloud/docs/en/compute/pcee/admin-guide/access-control/access-control
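
A minimal sketch of reading that timestamp programmatically, assuming the status response is JSON with a `live.lastModified` field (the field name cited elsewhere in this thread). The helper names `extractLiveLastModified` and `getLivePublishTime` are made up for the example, and no other part of the response shape is assumed.

```javascript
// Pull the live (published) timestamp out of a parsed status response.
// Returns a Date, or null if the field is missing or unparseable.
function extractLiveLastModified(status) {
  const value = status && status.live && status.live.lastModified;
  if (typeof value !== 'string') return null;
  const date = new Date(value);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Hypothetical wrapper around the status URL shown above.
// `fetchImpl` is injectable so the function can be exercised with a stub.
async function getLivePublishTime(statusUrl, fetchImpl = fetch) {
  const resp = await fetchImpl(statusUrl);
  if (!resp.ok) return null;
  return extractLiveLastModified(await resp.json());
}
```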

iansk commented 1 year ago

@maxakuru That makes sense, and it works for us.

Just to confirm the logic:

  1. Any push to main runs the Publish changed docs GitHub workflow.
  2. This workflow collects a list of changed files (git diff --name-only --diff-filter=ACMRT...), and then runs a "Franklin publish" on each file.
  3. Franklin keeps track of the publish time for each file.
  4. When a page is rendered/displayed, you get the value for "Last Updated" from live.lastModified from Franklin's admin status API.
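
Steps 1 and 2 above could be sketched roughly as follows. This is not the actual workflow code: `parseChangedFiles` and `publishChanged` are illustrative names, and `publishOne` is a hypothetical callback standing in for whatever the workflow uses to trigger a Franklin publish for one path.

```javascript
// Split the output of `git diff --name-only ...` into a list of changed paths.
function parseChangedFiles(diffOutput) {
  return diffOutput
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Publish each changed file in turn and record whether it succeeded.
// `publishOne(path)` is a hypothetical async callback that resolves to
// true or false; files that did not change are never touched, so their
// publish timestamps stay as they were.
async function publishChanged(diffOutput, publishOne) {
  const results = [];
  for (const path of parseChangedFiles(diffOutput)) {
    results.push({ path, ok: await publishOne(path) });
  }
  return results;
}
```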

maxakuru commented 1 year ago

@iansk mostly yes. the only difference from your description is that we use the last-modified header. we don't need an additional API call to get the timestamp, since we're already pulling the article content using fetch and have access to the response headers: https://github.com/hlxsites/prisma-cloud-docs-website/blob/3b362d7e3c1733cb1d4255892b961d1d8f5dc904/prisma/prisma-cloud/blocks/article/article.js#L76
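
As a rough sketch of the idea (not a copy of the linked `article.js`), reading the header from a fetch response might look like this. The function names, the `en-US` locale, and the date format options are illustrative assumptions.

```javascript
// Read the Last-Modified header from a fetch Response (or any object with
// a compatible `headers.get`) and parse it. Returns a Date or null.
function lastModifiedFrom(response) {
  const raw = response.headers.get('last-modified');
  if (!raw) return null;
  const date = new Date(raw);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Format the date for display. Locale and options are illustrative choices;
// UTC is used so the result does not depend on the viewer's time zone.
function formatLastUpdated(date) {
  return date.toLocaleDateString('en-US', {
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC',
  });
}
```

In a block that already does `const resp = await fetch(articleUrl)`, calling `lastModifiedFrom(resp)` costs no extra request, which is the point made above.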

iansk commented 1 year ago

@maxakuru Gotcha, thanks for clarifying that final detail.

Closing this issue. No work required.

(This works fine, I just didn't understand what was happening under the covers, and where the date was coming from.)