onelson / estuary

33 stars 8 forks source link

Show whether a crate version has been yanked. #18

Open mattdeboard opened 3 years ago

mattdeboard commented 3 years ago

The fact the crate data is harvested from the git reflog influences how I approached this. We're basically iterating over log lines, but since a single package & version pair can have multiple reflog lines, I needed a way to keep track of what package/version pairs I've already seen, and whether or not they have been yanked.

HashMap<(String, String), bool> seemed like a sensible choice for data structure here for fast lookups & duplicate elimination. The get_publishes function is only responsible for generating the tell-tale data structure. The logic to "massage" that data into display format is handled in frontend::landing.

mattdeboard commented 3 years ago

I need to update this to allow for yank --undo overwriting the bookkeeping

edit: And actually I'm not sure if the "parse reflog" approach will still work. The reflog is basically time series data given in reverse. So maybe just parsing the JSON for each crate and all is the better way. I'll work on that.

onelson commented 3 years ago

I actually never figured out how to unyank via cargo. TIL!

onelson commented 3 years ago

So maybe just parsing the JSON for each crate and all is the better way. I'll work on that.

Parsing the index files makes the most sense in terms of getting the package info, correct. The thing that's missing is when things happened.

mattdeboard commented 3 years ago

If we create our own command to invoke a custom command line, e.g. git reflog —date=iso, and read the reflog lines from there, we can get the dates in the log message, even tell git exactly how we want the date formatted so we dont have to do that in rust code.

onelson commented 3 years ago

A long term goal of mine will be to remove the need to shell out, so if we can avoid it for this, it'd be better. I'd be happy to see a dep on chrono added to assist with date parsing and formatting.

onelson commented 3 years ago

@mattdeboard I took a look at the diff and I think things are looking pretty good. I had a couple of thoughts.

Since the landing page currently acts a bit like an activity feed, this diff could be simplified along the lines of:

With that, we could go back to using a limit and skip the HashMaps.

The get_publishes() method could likely be renamed to account for it giving more than just the publish events. Likewise, updating the templates to say like "recent activity" and "all activity" or something.

How does that sound?