Closed: panges2 closed this 4 months ago
Looks great!
Will user data from GitHub and comments also be available to scrape?
How will this be connected to KYC?
Great proposal! This also helps mitigate some of the risks mentioned in #793 (similar to @dkkapur's suggestion https://github.com/filecoin-project/notary-governance/discussions/793#discussioncomment-4269944). With this proposal we could use git natively for backups without the overhead of parsing every issue and comment, and it enables us to use some other nice GitHub features, like workflows/actions, more effectively.
Some quick thoughts:
In order to make this new flow work, we need to implement a system of new branches, commits, and PRs.
When a new application is created, or there is a new datacap request, we need to:
We need to decide between having a single, large JSON file with all the applications, or one file per application:

One big JSON file:
- Pros: easy to scrape; all the info is in one place.
- Cons: if we merge 2 or more branches at the same time we will get a conflict and the merge won't take place, requiring human intervention to unblock the situation; if there is any mistake, reverting the PR can be very hard and not secure.

Many files:
- Pros: we shouldn't have conflict problems.
- Cons: we will have more than 1k files in the repo, and we will have the same fetching-limit problem we have right now.
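To make the "many files" option concrete, here is a minimal sketch of deriving a deterministic per-application path so that concurrent PRs touch different files and merge without conflicts. The directory layout, sharding scheme, and function names are my assumptions for illustration, not part of the proposal:

```python
import hashlib
import json


def application_path(application_id: str) -> str:
    """Derive a stable, unique file path for one application.

    Sharding by the first two hex digits of a hash keeps any single
    directory from accumulating thousands of files (hypothetical layout).
    """
    digest = hashlib.sha256(application_id.encode("utf-8")).hexdigest()
    return f"applications/{digest[:2]}/{application_id}.json"


def write_application(application_id: str, data: dict) -> tuple[str, str]:
    """Return the (path, serialized JSON) pair a bot would commit in a PR."""
    path = application_path(application_id)
    body = json.dumps(data, indent=2, sort_keys=True)
    return path, body
```

Because each application lives in its own file, two PRs for different applications never edit the same file, so merging them in any order stays conflict-free.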
I agree with fabrizio about both the technical considerations and the pros/cons of having one or many JSON files. I think we should discuss the above 3 items in more detail.
Maybe we can create a new JSON file periodically, e.g. every week or month. This could help with both the merge conflicts and the API limits problem.
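The periodic-file idea above could be sketched roughly as follows; the ISO-week naming and file layout are assumptions I'm making for illustration only:

```python
from datetime import date


def shard_filename(created: date) -> str:
    """Map an application's creation date to a weekly shard file.

    All applications created in the same ISO week land in one JSON file,
    which bounds both the total file count and the number of files any
    single week's PRs can conflict on (hypothetical scheme).
    """
    year, week, _ = created.isocalendar()
    return f"applications/{year}-W{week:02d}.json"
```

A monthly variant would just swap the ISO week for the month, trading fewer files for a larger conflict surface per file.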
Hi @fabriziogianni7 @huseyincansoylu
@cryptowhizzard
I support this proposal. The JSON format is API-friendly, but LDN applicants are likely to make errors, so an easy-to-use front-end tool is needed to help them generate this JSON.
I've made a related broader discussion in #891, but I'd like to make some more specific points here:
Support applications from GitHub - Right now this flow assumes that the client will make a new application at filplus.storage, but we should also support creating a PR directly, shouldn't we? I imagine we still want to give users the ability to easily use git to submit their own applications, opening PRs on GitHub?
Preserve GitHub labels - I suggest we use labels largely the same way we do now, because they give users a clear idea of the state of a DataCap application; the key difference is that the labels would be applied to PRs instead of issues (which have the same data structure within GitHub)
Treat in-repo content as source of truth - This refers to moving all shared references we can into the repo(s), which gives us a common source of truth; consider the following:
This is a path forward that gives us the best of both worlds: (1) structure and automation and (2) transparency and community usability.
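The label-preserving idea above could boil down to a simple state-to-label mapping that a bot applies to PRs instead of issues. Everything here, the state names, the label strings, and the function, is an illustrative assumption, not an agreed convention:

```python
# Hypothetical mapping from an application's lifecycle state to the
# label a bot would set on the corresponding PR (mirroring how labels
# are used on issues today).
STATE_LABELS = {
    "submitted": "validated",
    "in_review": "waiting for governance review",
    "ready_to_sign": "start sign datacap",
    "granted": "granted",
}


def labels_for_state(state: str) -> list[str]:
    """Return the labels a bot should set on the PR for a given state.

    Unknown states get an "error" label so a human notices the mismatch.
    """
    label = STATE_LABELS.get(state)
    return [label] if label is not None else ["error"]
```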
hey @orvn
I think it will still be possible to start an application through GitHub; I don't see a reason why not. As soon as we define how the pull request needs to look (branch name, PR title, PR files, etc.), it's just a matter of following the standard. I am not sure that will be easier than creating an application from filplus.storage or through an issue, but it will definitely be possible.
I would rely solely on the data inside the JSON files as the source of truth (at least from the code perspective), unless we define otherwise in the standard.
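Once the standard (branch name, PR title, PR files) is defined, checking that a manually opened PR follows it could be as simple as the sketch below. The `datacap/<application-id>` branch convention is purely hypothetical; I'm inventing it here to show the shape of the check:

```python
import re
from typing import Optional

# Hypothetical convention: branches named datacap/<application-id>,
# where the id is lowercase alphanumeric with dashes.
BRANCH_PATTERN = re.compile(r"^datacap/(?P<app_id>[a-z0-9][a-z0-9-]*)$")


def parse_branch(branch: str) -> Optional[str]:
    """Return the application id if the branch follows the standard, else None."""
    match = BRANCH_PATTERN.match(branch)
    return match.group("app_id") if match else None
```

A bot could close or flag any PR whose branch fails to parse, so hand-made PRs and filplus.storage PRs flow through the same pipeline.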
- I think it will still be possible to start an application through GitHub; I don't see a reason why not. As soon as we define how the pull request needs to look (branch name, PR title, PR files, etc.), it's just a matter of following the standard. I am not sure that will be easier than creating an application from filplus.storage or through an issue, but it will definitely be possible.
Yes, that's the goal. While most applications will come through filplus.storage or other similar future channels, the user should still be able to make a PR. A PR template could at least enforce some body format. I don't know if the forked branch naming convention matters too much (but if you have a reason I'm not thinking of, let me know @jbesraa!)
- I would rely solely on the data inside the JSON files as the source of truth (at least from the code perspective), unless we define otherwise in the standard.
Agree, the `json`s are the source of truth now, so basically we would want them merged into the codebase as efficiently as possible, so that the `main` branch is always as current as possible. The repo itself will only represent active DataCap allocations, and not unapproved applications like the issues do right now, correct?
We can't use forks if we want to keep using issues as an entry point (i.e. creating an application through an issue that will be transferred to a PR), because we can't fork without having OAuth from the user.
Active pull requests will represent one of the following:
1. New application
2. Removal request
3. Refill request
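A bot processing active PRs would need to dispatch on these three request types. A minimal sketch of that dispatch, assuming a hypothetical PR-title prefix convention (the prefixes are my invention, not part of the proposal):

```python
from enum import Enum
from typing import Optional


class RequestType(Enum):
    NEW_APPLICATION = "new application"
    REMOVAL = "removal request"
    REFILL = "refill request"


# Hypothetical PR-title prefixes for each of the three request types.
TITLE_PREFIXES = {
    "Application:": RequestType.NEW_APPLICATION,
    "Removal:": RequestType.REMOVAL,
    "Refill:": RequestType.REFILL,
}


def classify_pr(title: str) -> Optional[RequestType]:
    """Classify a PR by its title prefix; None if it is not a DataCap PR."""
    for prefix, request_type in TITLE_PREFIXES.items():
        if title.startswith(prefix):
            return request_type
    return None
```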
TLDR
Update how we store Fil+ application data on GitHub so the system scales and is easier to automate. If you are a notary, it should not change how you interact with the system.
Context
The current system of storing all Fil+ LDN application data on GitHub as comments in issues is not scalable, for various reasons. In this proposal, we aim to address the issues with the current system by updating the way core Fil+ LDN data is stored on GitHub.
Issues with the Current System:
Role of Github
Why still use GitHub? GitHub provides a simple way to store large amounts of data with the following advantages:
Proposed new application flow
Benefits of this new flow
Application Schema
Note: Still subject to change
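Since the schema is still subject to change, the following is only a rough sketch of what a per-application JSON record and its validation might look like. Every field name and value here is an assumption for illustration, not the proposed schema:

```python
# Hypothetical per-application record; the real schema is still under
# discussion, so treat every field below as a placeholder.
EXAMPLE_APPLICATION = {
    "version": 1,
    "id": "example-application",
    "client": {
        "name": "Example Client",
        "region": "EU",
        "website": "https://example.com",
    },
    "datacap": {
        "total_requested": "5PiB",
        "weekly_allocation": "500TiB",
    },
    "lifecycle": {
        "state": "submitted",
        "allocations": [],  # appended to as notaries sign allocations
    },
}


def validate(app: dict) -> list[str]:
    """Return the missing top-level fields (empty list means valid)."""
    required = ("version", "id", "client", "datacap", "lifecycle")
    return [field for field in required if field not in app]
```

Versioning the record (`"version": 1`) would let tooling evolve the schema later without breaking the parsing of older applications already merged into the repo.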
Timeline
- Week 1-2: finalizing discussion
- Week 3-4: finalizing design internally
- Week 5-9: implementation
- Week 9-14: testing and fixes
Technical dependencies
Tooling for the registry and the SSA bot will have to change and be redeployed.
End of POC checkpoint (if applicable)
Week 14
Risks and mitigations