the-lean-crate / criner

A tool to mine crates.io and produce static websites
MIT License

Broken links on waste report #9

Open edmorley opened 2 years ago

edmorley commented 2 years ago

Hi

If I visit https://the-lean-crate.github.io/waste/ several of the generated package-detail links are broken.

For example the top package there, tree-sitter-parsers, links to: https://the-lean-crate.github.io/waste/tree-sitter-parsers

...which 404s.

And the 0.0.5 in "Waste in 0.0.5" links to: https://the-lean-crate.github.io/waste/tree-sitter-parsers/0.0.5.html

...which also 404s.

Byron commented 2 years ago

Thanks for reporting, I wasn't aware.

Here is what I could find out thus far.

  1. The state on disk works as expected: a local dev HTTP server running on my working tree of https://github.com/the-lean-crate/waste serves the pages that aren't present on GitHub.
  2. git status on said working tree revealed that a lot of files weren't actually checked into the git repository.
  3. The files missing on GitHub were indeed added to the repo:
    create mode 100644 web-tree-sitter-sys/0.3.0.html
    create mode 100644 web-tree-sitter-sys/0.4.0.html
    create mode 100644 web-tree-sitter-sys/0.4.1.html
    rewrite web-tree-sitter-sys/index.html (90%)
  4. When pushing this it would send 159025 new objects (changed files, new files, changed trees, new trees)
  5. Note that a clone of the waste repository prior to the commit above would also be missing the files in question

Indeed, it looks like criner fails to properly add new files to the tree. At some point it uses a git2 tree builder to do this quickly; maybe something goes wrong there.

For now I'd hold back on spending more time investigating, knowing that once I start using gitoxide for this, it will be reviewed naturally. Until then I hope it's enough to work with the fixed report, which I can fix again occasionally when there is demand.

Update: there seems to be more breakage. By the looks of it, each tree consists only of the changed items and doesn't contain any unchanged ones. In theory this would mean that after each update only the updated files are visible in the report. For now I'm holding off on even looking into it, in favor of eventually moving this functionality to gitoxide.
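
A minimal sketch of the failure mode suspected in the update above, under stated assumptions: the tree is modeled as a plain map from file name to blob id, and the names Tree, build_from_empty and build_from_previous are hypothetical, not taken from criner or git2. It only illustrates why a tree built from the changed items alone loses everything that didn't change, and how seeding the builder with the previous tree avoids that. For reference, git2's Repository::treebuilder accepts an optional existing tree to start from; whether criner passes one has not been checked here.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a git tree: entry name -> blob id.
/// Hypothetical model for illustration, not a git2 type.
type Tree = BTreeMap<String, String>;

/// Builds the next tree from the changed entries only, ignoring the
/// previous tree. This mirrors the suspected bug: unchanged entries vanish.
fn build_from_empty(changes: &[(&str, &str)]) -> Tree {
    let mut tree = Tree::new();
    for (name, id) in changes {
        tree.insert(name.to_string(), id.to_string());
    }
    tree
}

/// Seeds the next tree with the previous one, then applies the changes
/// on top, so unchanged entries survive and changed ones are overwritten.
fn build_from_previous(previous: &Tree, changes: &[(&str, &str)]) -> Tree {
    let mut tree = previous.clone();
    for (name, id) in changes {
        tree.insert(name.to_string(), id.to_string());
    }
    tree
}

fn main() {
    // Previous state of one crate directory (made-up blob ids).
    let mut previous = Tree::new();
    previous.insert("0.3.0.html".into(), "blob-a".into());
    previous.insert("0.4.0.html".into(), "blob-b".into());

    // One update: a new version page and a rewritten index.
    let changes = [("0.4.1.html", "blob-c"), ("index.html", "blob-d")];

    let broken = build_from_empty(&changes);
    let fixed = build_from_previous(&previous, &changes);

    println!("from empty:    {:?}", broken.keys().collect::<Vec<_>>());
    println!("from previous: {:?}", fixed.keys().collect::<Vec<_>>());
}
```

In this model the tree built from empty contains only the two changed entries, while the seeded one also keeps the older version pages, which matches the symptom of older or newer pages going missing from the report depending on which update ran last.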

duskmoon314 commented 2 years ago

I just found this awesome project and noticed that the recent versions of my crate are missing. I think it is related to this issue.

https://the-lean-crate.github.io/waste/d1-pac/ only has 0.0.3 to 0.0.16, but I released 0.0.23 3 days ago.