SAP / project-portal-for-innersource

Lists all InnerSource projects of a company in an interactive and easy to use way. Can be used as a template for implementing the "InnerSource portal" pattern by the InnerSource Commons community.
https://sap.github.io/project-portal-for-innersource/
Apache License 2.0

Extend docs about crawler logic and references implementations #18

Closed: spier closed 3 years ago

spier commented 3 years ago

Hi @Michadelic,

I had always wanted to add links from your project to the crawler implementations that already exist. That makes it easier for users to get from using the mock data to using their own project data.

While working on this I was unsure where in your docs to put the documentation about the crawler implementations.

I think it would be best to consolidate all info related to the repos.json structure and the crawling in a single place.

I put a quick WIP PR together here, so that you can roughly see what I have in mind.

What I did:

Before I go any further with this I wanted to confirm with you if you would be on board with the general approach.

This still contains a lot of redundant data now etc. Will clean that up when you give me the green light :)

spier commented 3 years ago

I was thinking about this a bit more, and I have a suspicion now why the instructions for how to get a project listed in the portal were kept in CONTRIBUTING.md.

Is it because when this was a SAP-internal project, getting any repo listed in the portal was considered a "contribution"?

Just trying to understand this so that I can maybe make more helpful proposals to consolidate the documentation.

Michadelic commented 3 years ago

Hi @spier, you are right, we consider listing a project a tiny contribution, so we kept this info in CONTRIBUTING.md. However, as the project matures, I think it's better to consolidate all info about the crawling into a separate file. Will take a look at the code now.

spier commented 3 years ago

Sounds good @Michadelic. I will push some further commits to this PR later today, with a first stab at what I think the consolidated documentation could look like.

spier commented 3 years ago

Hi @Michadelic. I wasn't 100% sure what you meant by "looking at the code" above :)

Could you please confirm again whether I should go ahead with this PR and consolidate all related info into CRAWLING.md?

Michadelic commented 3 years ago

@spier sure, I meant look at the code of this PR, I think it's already in a decent state. I would just keep the instructions for how to list a project separate from how to implement a crawler. What do you think?

spier commented 3 years ago

@Michadelic that sounds good.

I think that the issue with "which docs should go where" arises due to different personas. Let's call them user and maintainer.

A user who wants to get a project listed in the portal only needs the "Listing Project in the Project Portal for InnerSource" instructions. All further docs about how the portal works from a user's perspective should live in the portal itself, and not in this repo. One could even consider moving the description section from the README into the portal itself.

The maintainer on the other hand needs to know how to install the portal, how to host it, and how to write a customized crawler that works for their org. So those maintainers need all the docs in this repo. The maintainers would also be the ones making upstream contributions to this repo, in case of bugs/features/etc.

Proposal

Given the thoughts on these two personas, how about this approach?

For the user

We create howto-use-this-portal.md as the docs for the user. It would contain:

We can then even link to this file when a user clicks the +-button in the demo portal, instead of the placeholder URL https://yourcompany.corp/innersource-instructions.

For the maintainer:

We keep things mostly as they are in the current README, maybe calling out explicitly that this documentation is only required for maintainers.

We extend it with further info about the crawling process. I would start with what I got in CRAWLING.md now. If the README ends up being too long for our liking, we can break it out into a separate file later.

What do you think?

I can create a rough draft of the proposed changes, and push them to this branch. Maybe it is easier to understand if it "feels right" once we can look at something?

zkoppert commented 3 years ago

Anything I can do to help here?

spier commented 3 years ago

Thanks for the offer @zkoppert. Not that I would know right away.

The documentation approach is relatively clear, I just have to find the time.

As part of this documentation I also wanted to make my crawler implementation public, but maybe that is just scope creep with which I am making the problem too hard for myself.

Will try to find the time to wrap this up "soon" :)

spier commented 3 years ago

This PR is ready for a proper review now.

Open questions are:

  1. how to point from README to LISTING?
  2. In CRAWLING, steps 2 and 3 are optional. Some of the bullets below those steps are marked as optional too. That seems redundant. We should validate that this data is indeed optional and that the portal works without it
  3. CRAWLING mentions that "You can write the API response to repos.json as is". That might not be 100% correct. Maybe we can fix that when adding a curl/jq example of the most minimal crawling approach :)
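
For question 3, here is a rough sketch of what "writing the API response to repos.json as is" could look like, in Python rather than curl/jq. It assumes the crawler reads from the GitHub REST API "list organization repositories" endpoint and that the response is a JSON array of repository objects. The function names `fetch_org_repos` and `write_repos_json` are hypothetical and not part of this repo. Whether the portal accepts the unmodified response is exactly the open point above, so treat this as a starting point, not a verified crawler.

```python
import json
import urllib.request
from pathlib import Path


def fetch_org_repos(org, token):
    """Fetch the first page of repositories for a GitHub organization.

    Assumption: one page (up to 100 repos) is enough for a minimal example;
    a real crawler would have to follow pagination.
    """
    request = urllib.request.Request(
        f"https://api.github.com/orgs/{org}/repos?per_page=100",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def write_repos_json(repos, path="repos.json"):
    """Write the list of repository objects to disk unchanged.

    Returns the number of repositories written.
    """
    Path(path).write_text(json.dumps(repos, indent=2), encoding="utf-8")
    return len(repos)
```

Usage would be along the lines of `write_repos_json(fetch_org_repos("my-org", token))`, where `my-org` and `token` are placeholders for your own organization name and access token.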

Looking forward to your feedback.

Michadelic commented 3 years ago

Thank you very much for this great contribution, the documentation is now much more readable!