JoshMcguigan / traffic

Simple traffic reporting for your Github repositories
Apache License 2.0
35 stars 3 forks source link

Track git clones and forks? #4

Open alichtman opened 6 years ago

JoshMcguigan commented 6 years ago

I'd be open to a PR for this.

There are a couple of implementation details I'm not sure about. Currently, running traffic shows unique visits by default. I don't really want to require significantly more screen width, at least by default. So perhaps the clone/fork data could be hidden behind CLI arguments? Maybe something like --show-clones and --show-forks?

Also, while we're at it, it makes sense to also bring in the all traffic data (as opposed to unique visitors which is what traffic looks at currently). This should be hidden behind a CLI argument as well.

Finally, the trend column right now works on the unique visitors data. Would you see a trend column for each dataset the user has enabled? So if they are looking at unique visitors and clones by running traffic --show-clones then they would get a trend column for both? Or would the trend always be unique visitors?

Do you have any other thoughts on the implementation?

alichtman commented 6 years ago

Maybe something like --show-clones and --show-forks?

On board with this.

hidden behind CLI arguments

Just a thought: I think it'd be a good idea to add a --show-all argument to display all the data (both default and hidden behind command line args) once data starts being hidden behind certain args.

Also, while we're at it, it makes sense to also bring in the all traffic data (as opposed to unique visitors which is what traffic looks at currently). This should be hidden behind a CLI argument as well.

Agreed.

Finally, the trend column right now works on the unique visitors data. Would you see a trend column for each dataset the user has enabled? So if they are looking at unique visitors and clones by running traffic --show-clones then they would get a trend column for both? Or would the trend always be unique visitors?

Maybe the right move is a config file, where the user sets their preferences. I'm thinking of something like this (INI, YAML, etc.), and just leave the default config the way it is now.

display-clones: True,
display-clones-trend: True,
display-forks: True,
...

In terms of a PR -- I don't know Rust, and I've got my hands full for a while, so I probably will not be learning it. I'm unfortunately not the guy.

JoshMcguigan commented 6 years ago

Thanks for the feedback. I've added the good first issue label in hopes of encouraging a new contributor here. If anyone is interested in implementing this, let me know. I'd be happy to help in any way I can.

If it doesn't get picked up in the next few weeks then I'll take another look at implementing it myself.

SebRut commented 6 years ago

I'd be interested to try this. I've haven't got a lot of experience with rust yet, but am motivated to do something with it.

JoshMcguigan commented 6 years ago

@SebRut Go for it! Let me know if you need any help.

It actually looks like Github doesn't expose the fork counts, but they do have a clones endpoint.

https://developer.github.com/v3/repos/traffic/

According to the link above, the clone data actually mirrors the views data, so we should just be able to rename the Views, ViewsForDay, and ViewsForTwoWeeks data structures to something like Counts, CountsForDay, and CountsForTwoWeeks. Then, you can add the HTTP request to github.rs to retrieve the clones data.

alichtman commented 6 years ago

It actually looks like Github doesn't expose the fork counts

https://developer.github.com/v3/activity/events/types/#forkevent

The way I'd solve this problem is by crawling the event history of each event and looking for forkEvents.

For an idea of how this would work, check out: https://github.com/danwallach/github-event-times

SebRut commented 6 years ago

Seems like I'll have to pass on this, as termion isn't compatible with windows yet (https://github.com/redox-os/termion/issues/103) which makes compilation impossible for me.

JoshMcguigan commented 6 years ago

@SebRut Can you check out #6? I've created an issue and pull request for Windows compatibility and I'd appreciate your feedback.

SebRut commented 6 years ago

I added support for clones (SebRut/traffic@044079486866eb9eb0cf480694437ddaaeb6f8f9). Support for trends is also there, while not actually being shown for now. I'd appreciate any feedback on this :)

JoshMcguigan commented 6 years ago

@SebRut Thanks for taking this on. Feedback below:

Since the clone data actually mirrors the views data, we can rename the Views, ViewsForDay, and ViewsForTwoWeeks data structures to Counts, CountsForDay, and CountsForTwoWeeks. Then, those same data structures can be used for the clones data and you won't have all of that duplicated code in the views.rs file.

It would be nice if the network requests for the clones data wasn't made if the show-clones cli option wasn't set.

It's possible to reduce the code duplication in the get_formatted_output method. You can setup the clones data as a separate string, and if the show-clones cli option is set then include that string, otherwise use an empty string. That way you don't have to copy/paste all the rest of the output data.

Let me know if you have any questions, and thanks again for contributing.

SebRut commented 6 years ago

Since the clone data actually mirrors the views data, we can rename the Views, ViewsForDay, and ViewsForTwoWeeks data structures to Counts, CountsForDay, and CountsForTwoWeeks. Then, those same data structures can be used for the clones data and you won't have all of that duplicated code in the views.rs file.

Oh, big fail on my side, I was reading Count as Clone the whole time and was wondering what you meant with renaming, will do that.

It's possible to reduce the code duplication in the get_formatted_output method. You can setup the clones data as a separate string, and if the show-clones cli option is set then include that string, otherwise use an empty string. That way you don't have to copy/paste all the rest of the output data.

I guess we need to do a more modular construction of the output string anyway, with combinations of "only views", "views + clones", "views + clones + forks", etc. coming up sooner or later.

Thank you for the feedback.

SebRut commented 6 years ago

I just got back to working on this, and after renaming I stumbled upon a new error arising with this change, the deserialization from JSON. At the moment this, expects an "views" array of every CountsForTwoWeeks JSON even for the clones data, where the array is named "clones". I'm currently evaluating how to solve this, but I'd like to hear any ideas you got.

JoshMcguigan commented 6 years ago

Oh, interesting. Perhaps something clever could be done with the serde field attributes. I'd reach out to the serde developers as described in the serde help.

Alternatively, you could have separate structs for ViewsForTwoWeeks and ClonesForTwoWeeks, where they would each have a field which contains a Vec<CountsForDay>. Then implement From for conversion between ViewsForTwoWeeks and ClonesForTwoWeeks into CountsForTwoWeeks. RepoDetails would then have a field called views and separate a field called clones, but each would have the datatype CountsForTwoWeeks. This would allow the impl block that contains get_trend_uniques and get_trend_uniques to be written for type CountsForTwoWeeks, which means we don't have to duplicate that logic. You would then convert from ViewsForTwoWeeks and ClonesForTwoWeeks into CountsForTwoWeeks when you build each RepoDetails struct. Let me know if you have any questions about this approach.