jgehrcke / github-repo-stats

GitHub Action for advanced repository traffic analysis and reporting
Apache License 2.0

Several questions after first use: actions.yaml, PR to main, using GH Pages #48

Closed. evil-shrike closed 2 years ago

evil-shrike commented 2 years ago

Hi, thanks for the workflow. Can I ask a couple of questions?

andry81 commented 2 years ago

@evil-shrike You can take a look at a much simpler solution without any visual reports: https://github.com/andry81/github-accum-stats

jgehrcke commented 2 years ago

Hello @evil-shrike!

Thanks for the questions and feedback. Trying to go through this point by point.

what is the actions.yml file mentioned in the README? Where is it supposed to be placed?

I assume you refer to the example workflow YAML document in https://github.com/jgehrcke/github-repo-stats#setup. Right above the YAML the README currently says

Create a GitHub Actions workflow file in the data repository (in the example this is the repo bob/private-ghrs-data-repo). Example path: .github/workflows/repostats-for-nice-project.yml.

That should answer the question. Do you think that needs further clarification?

After I run my workflow, I got a PR to main that I had to manually merge. Will it happen after each run? I'd expect some fully automated solution...

I am puzzled. This GitHub Action does not submit pull requests. Is there some funny mix-up going on here? Are you asking in the right repository? :)

This GitHub Action here is -- by my definition -- rather fully automated. It commits to the data branch in the data repository -- autonomously.

last-report/report.md contains some script tag

I have an intuition about where you're going with this observation. It's important to understand that the Markdown source is not trying to be compliant with GitHub Markdown flavor.

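The Markdown source is compliant with Pandoc Markdown with the following extensions enabled:

* https://pandoc.org/MANUAL.html#extension-raw_html

* https://pandoc.org/MANUAL.html#extension-native_divs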

Pandoc is invoked with this command line flag: --from=markdown+pandoc_title_block+native_divs. See code.

There is also a helpful answer on Stack Overflow.

I made that choice to be able to add, for example, containers of the following kind:

<div class="pagebreak-for-print"> </div>

This special Markdown flavor also makes it easy to put raw <script> tags into the Markdown source and have them end up in Pandoc's HTML output.

I consider the Markdown source to be an implementation detail of this project; what we care about is the PDF and HTML output.
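To make the Pandoc step more concrete, here is a minimal Python sketch of how such an invocation could be assembled. Only the --from value is taken from the flag quoted above; the helper names, the file names, and the other options (--to, --standalone, --output) are assumptions for illustration and not this project's actual code.

```python
import subprocess
from pathlib import Path

# Input format taken from the command line flag quoted above.
PANDOC_FROM = "markdown+pandoc_title_block+native_divs"


def build_pandoc_command(md_path: Path, html_path: Path) -> list[str]:
    """Return the argument list for converting the report Markdown to HTML.

    Hypothetical helper: the real project may pass different options.
    """
    return [
        "pandoc",
        f"--from={PANDOC_FROM}",
        "--to=html",
        "--standalone",  # emit a complete HTML document, not a fragment
        f"--output={html_path}",
        str(md_path),
    ]


def render_report(md_path: Path, html_path: Path) -> None:
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_pandoc_command(md_path, html_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_pandoc_command(Path("report.md"), Path("report.html"))))
```

Building the argument list separately from running it keeps the command easy to inspect without needing Pandoc installed.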

is it WAI?

I do not understand the question. Do you refer to the Web Accessibility Initiative?

it'd be great to use GH Pages for generated html

Yes. I do that! Here is an example: https://github.com/jgehrcke/ghrs-test/blob/de6e5e4be1cb4ff258ba03cce05ef5509e8f92b0/.github/workflows/github-repo-stats.yml#L18

why not put the latest report into the docs folder that GH Pages supports?

I am not sure if I really understand the intent. Once you activate GitHub Pages for the data repository you will be able to construct a URL to the generated HTML; regardless of where it lives.

As shown in the example above, you can use the ghpagesprefix parameter so that this GitHub Action will generate a link to the GH Pages-exposed report HTML in the automatically generated README. Example: https://github.com/jgehrcke/ghrs-test/tree/de6e5e4be1cb4ff258ba03cce05ef5509e8f92b0/jgehrcke/github-repo-stats (look for the last line in this README).
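As a rough illustration of how such a URL can be put together, here is a small Python sketch. It assumes that ghpagesprefix is the base URL of the data repository's GitHub Pages site and that the report lives under <owner>/<repo>/latest-report/report.html in the data repository; the function name and the example values are hypothetical.

```python
def report_url(ghpagesprefix: str, stats_repo: str) -> str:
    """Compose the GitHub Pages URL of the latest HTML report.

    ghpagesprefix: base URL of the data repository's GitHub Pages site
        (assumed format, e.g. "https://me.github.io/my-stat-repo").
    stats_repo: "owner/name" of the repository the stats are for.
    """
    owner, name = stats_repo.split("/")
    # Assumed layout in the data repo: <owner>/<name>/latest-report/report.html
    return f"{ghpagesprefix.rstrip('/')}/{owner}/{name}/latest-report/report.html"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(report_url("https://me.github.io/my-stat-repo/", "org/project"))
```

The trailing slash on the prefix is stripped so that both forms of the prefix yield the same URL.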

jgehrcke commented 2 years ago

You can take a look at a much simpler solution

Curious: what makes us think that andry81/github-accum-stats is simpler? :) @andry81 I would appreciate specific suggestions for simplifications of this project here if you have them. Thank you!

andry81 commented 2 years ago

You can take a look at a much simpler solution

Curious: what makes us think that andry81/github-accum-stats is simpler?

Because storing generated reports in repositories is quite an overkill. You can store the statistics; the report can always be generated locally.

jgehrcke commented 2 years ago

Thank you for clarifying, @andry81. Well. Each project has its priorities -- auto-generating a nicely readable HTML and PDF document covering an arbitrarily long time frame of data is the main purpose of this project. That's precisely the non-trivial part, the way I look at it. In that sense maybe it's not comparable to yours :).

I agree that the corresponding git repository will grow to a certain size because of the "lot" of byte changes caused by the PDF and HTML being in the repo. From Earth's point of view :earth_africa: that could be done in a way that consumes fewer resources. But that's nothing for the user to worry about. We brutally make use of GitHub's business decisions -- providing free storage as well as GitHub Pages. Also, because compression is applied, the git data throughput isn't large at all.

andry81 commented 2 years ago

Also, because compression is applied, the git data throughput isn't large at all.

Better to rewrite the repo history below the head, removing obsolete data; then there is no need to compress.

jgehrcke commented 2 years ago

Interesting thought, sure!

evil-shrike commented 2 years ago

Hello @jgehrcke

Thank you for your comprehensive answer.

what is the actions.yml file mentioned in the README? Where is it supposed to be placed?

I assume you refer to the example workflow YAML document in https://github.com/jgehrcke/github-repo-stats#setup. Right above the YAML the README currently says

Create a GitHub Actions workflow file in the data repository (in the example this is the repo bob/private-ghrs-data-repo). Example path: .github/workflows/repostats-for-nice-project.yml.

That should answer the question. Do you think that needs further clarification?

Yes, I'm referring to this guide, but below the mentioned text there's an "Input parameter reference" section which contains "Extract from action.yml:" with some YAML. It wasn't clear where I should put that "action.yml", if at all. Now I understand it's just a description of the parameters supported in the workflow file.

After I run my workflow, I got a PR to main that I had to manually merge. Will it happen after each run? I'd expect some fully automated solution...

I am puzzled. This GitHub Action does not submit pull requests. Is there some funny mix-up going on here? Are you asking in the right repository? :)

This GitHub Action here is -- by my definition -- rather fully automated. It commits to the data branch in the data repository -- autonomously.

Yes, my bad, it does commit to the "github-repo-stats" branch, but at the same time, after the first run there was a PR created for main:

image

I didn't make it up, honestly! :)

last-report/report.md contains some script tag

I have an intuition about where you're going with this observation. It's important to understand that the Markdown source is not trying to be compliant with GitHub Markdown flavor.

The Markdown source is compliant with Pandoc Markdown with the following extensions enabled:

* https://pandoc.org/MANUAL.html#extension-raw_html

* https://pandoc.org/MANUAL.html#extension-native_divs

Pandoc is invoked with this command line flag: --from=markdown+pandoc_title_block+native_divs. See code.

There is also a helpful answer on Stack Overflow.

I made that choice to be able to add, for example, containers of the following kind:

<div class="pagebreak-for-print"> </div>

This special Markdown flavor also makes it easy to put raw <script> tags into the Markdown source and have them end up in Pandoc's HTML output.

I consider the Markdown source to be an implementation detail of this project; what we care about is the PDF and HTML output.

Ok, can I suggest to not generate it by default because it's a bit confusing. Just add an option (e.g. generate-pandadoc-md: true).

is it WAI?

I do not understand the question. Do you refer to the Web Accessibility Initiative?

sorry, WAI stands for Work As Intended

it'd be great to use GH Pages for generated html

Yes. I do that! Here is an example: https://github.com/jgehrcke/ghrs-test/blob/de6e5e4be1cb4ff258ba03cce05ef5509e8f92b0/.github/workflows/github-repo-stats.yml#L18

why not put the latest report into the docs folder that GH Pages supports?

I am not sure if I really understand the intent. Once you activate GitHub Pages for the data repository you will be able to construct a URL to the generated HTML; regardless of where it lives.

As shown in the example above, you can use the ghpagesprefix parameter so that this GitHub Action will generate a link to the GH Pages-exposed report HTML in the automatically generated README. Example: https://github.com/jgehrcke/ghrs-test/tree/de6e5e4be1cb4ff258ba03cce05ef5509e8f92b0/jgehrcke/github-repo-stats (look for the last line in this README).

OK, it was a long time ago that I set up GH Pages, so I was a bit confused by the description where it's suggested to choose between the "root" and "docs" folders:

image

So my idea was that, since we have a dedicated repo for stats, it would probably make sense to generate the assets in such a way that people get a site with the stats automatically. You explained that the generated Markdown isn't supposed to be used on GH; OK, so we can't use the repo's README. Then we have a PDF and an HTML file. But they are put inside nested folders, like repo-name/project-name/latest-report/report.html. So I have neither the stats in my README nor a project site created by activating GH Pages. I have to go by a long nested path: http://me.github.io/my-stat-repo/org/project/latest-report/report.html Can we set up the workflow to generate the HTML inside the root's docs folder, so that the project's GH Pages site would show the reports by default?

jgehrcke commented 2 years ago

quick feedback on this point:

"Extract from action.yml:" with some yaml. It wasn't clear where should I put that "action.yml" if at all.

Gotcha. Yeah, action.yml sits at the root of the repository and tells GitHub about the input parameters of this action: https://github.com/jgehrcke/github-repo-stats/blob/074445ed37842074a09253bb2d638b96a613cf5d/action.yml -- as the user you don't need to know about that file itself. The contents, though, are critical (enumeration of parameters, and help texts). I might look into removing the mention of action.yml in one of the next documentation iterations.

jgehrcke commented 2 years ago

I didn't make it up, honestly! :)

Well, but https://github.com/evil-shrike/triggerator-stats/pull/1 was opened by your identity, the evil-shrike user. So. Hm. :).

jgehrcke commented 2 years ago

Ok, can I suggest to not generate it by default because it's a bit confusing. Just add an option (e.g. generate-pandadoc-md: true).

pandoc, not pandadoc :D :panda_face:. :)

You expected to see just the HTML and PDF documents, but not the Markdown file, yes? So that would be more along the lines of commit_markdown: false. Maybe.

jgehrcke commented 2 years ago

I have to go by a long nested path: http://me.github.io/my-stat-repo/org/project/latest-report/report.html

Making path components configurable is indeed valuable thinking. From a development point of view, though, it blows up complexity quite a bit and demands quite a bit of testing. That's why I think I wouldn't do this 'quickly tomorrow'. But yeah, keeping this on the horizon!

jgehrcke commented 2 years ago

Closing this for now.