fortran-lang / registry

Registry for Fortran package manager
MIT License
8 stars 3 forks source link

Package upload api #19

Open minhqdao opened 1 year ago

minhqdao commented 1 year ago

Let's define the api for uploading packages. How do you want a JSON request for uploading a package to look like? @arteevraina @henilp105

arteevraina commented 1 year ago

When @henilp105 & I created the upload API, it was supposed that information such as tags related to the project, dependencies will also come from frontend.

Should I remove these fields for now from the backend or just use a default value for all the packages. So, that the other APIs which are trying to fetch it does not break or can you send this data from fpm as well ?

The data that we might need but it's not necessary right now :

description, tags and dependencies

The data which is necessary and needs to be sent from fpm includes:

package_name, package_version, tarball, upload_token, license

cc: @minhqdao @henilp105 @perazz

henilp105 commented 1 year ago

Should we add time based expiry on the token or naming the tokens , as we would also have to add the functionality to revoke the tokens or expire them after certain time period.

arteevraina commented 1 year ago

Exactly @henilp105 , That's how it should be. But considering the time-frame we can maybe add later as well.

henilp105 commented 1 year ago

So, are we currently making the tokens irrevocable tokens which the user won't be able to re-view after creating them once. ( every time the user requests a new token would be generated. )

arteevraina commented 1 year ago

So, are we currently making the tokens irrevocable tokens which the user won't be able to re-view after creating them once. ( every time the user requests a new token would be generated. )

Yes, I think that works for now. But, we have to keep in mind that tokens should expire after certain amount of time & also an API + UI for deleting the generated tokens should be good.

arteevraina commented 1 year ago

The post request for the API should be sent in the form of form-data like this:

{
   "package_name": "some_package_name",
   "package_version": "X.X.X",
   "package_license": "MIT",
   "upload_token": "randomstringhere",
   "tarball": <attach_tar.gz>
}

I have changed the deployment to use vercel for now because it is fast than render.com & not causing any errors now. Url - https://registry-apis.vercel.app/

Endpoint :

<base-url>/packages
request-type - POST

I will separately DM the upload token that @minhqdao & @perazz can use to make the requests.

minhqdao commented 1 year ago

Should we add time based expiry on the token or naming the tokens , as we would also have to add the functionality to revoke the tokens or expire them after certain time period.

We have a createdDate of the token and therefore do not need to add an expiredDate or an expirationTimeframe. Instead, when the user wants to use the uploadToken, we check, e.g. when the token is supposed to be valid for one week, if the token was created less than a week ago. With that you only need the createdDate to check for validity.

minhqdao commented 1 year ago

The post request for the API should be sent in the form of form-data like this:

{
   "package_name": "some_package_name",
   "package_version": "X.X.X",
   "package_license": "MIT",
   "upload_token": "randomstringhere",
   "tarball": <attach_tar.gz>
}

I have changed the deployment to use vercel for now because it is fast than render.com & not causing any errors now. Url - https://registry-apis.vercel.app/

Endpoint :

<base-url>/packages
request-type - POST

I will separately DM the upload token that @minhqdao & @perazz can use to make the requests.

I'm not quite sure what you mean by sending form data. I hope we are not mixing frontend with backend techniques here. I will send data via the -d parameter (curl) or via --post-data (wget). It is the default way of sending data through a POST request.

We had this discussion already when we were thinking about sending a list of cached versions to the backend to determine whether we need to download a version or not.

minhqdao commented 1 year ago

When @henilp105 & I created the upload API, it was supposed that information such as tags related to the project, dependencies will also come from frontend.

Should I remove these fields for now from the backend or just use a default value for all the packages. So, that the other APIs which are trying to fetch it does not break or can you send this data from fpm as well ?

The data that we might need but it's not necessary right now :

description, tags and dependencies

The data which is necessary and needs to be sent from fpm includes:

package_name, package_version, tarball, upload_token, license

cc: @minhqdao @henilp105 @perazz

Yes, let's just ignore the optional fields for now and deliver them as soon as possible.

arteevraina commented 1 year ago

We have a createdDate of the token and therefore do not need to add an expiredDate or an expirationTimeframe. Instead, when the user wants to use the uploadToken, we check, e.g. when the token is supposed to be valid for one week, if the token was created less than a week ago. With that you only need the createdDate to check for validity.

I would agree here. This can be checked in the upload API itself.

arteevraina commented 1 year ago

I'm not quite sure what you mean by sending form data. I hope we are not mixing frontend with backend techniques here. I will send data via the -d parameter (curl) or via --post-data (wget). It is the default way of sending data through a POST request.

There must be a way in curl or wget to specify the headers in which you mention content-type as "multipart/form-data" ?

I am thinking to use multipart/form-data because it will enable the upload of tarball files.

https://stackoverflow.com/questions/19116016/what-is-the-right-way-to-post-multipart-form-data-using-curl

minhqdao commented 1 year ago

I'm not quite sure what you mean by sending form data. I hope we are not mixing frontend with backend techniques here. I will send data via the -d parameter (curl) or via --post-data (wget). It is the default way of sending data through a POST request.

There must be a way in curl or wget to specify the headers in which you mention content-type as "multipart/form-data" ?

I am thinking to use multipart/form-data because it will enable the upload of tarball files.

https://stackoverflow.com/questions/19116016/what-is-the-right-way-to-post-multipart-form-data-using-curl

Of course we could change the content type in the header to multipart/form-data. But in my understanding, that is actually a frontend technique for websites that you can use for convenience. The general way to send data to an api is using the data(-d) parameter, not the form (https://www.outsystems.com/blog/posts/consuming-multipart-form-data-rest-method/#multipart-form-data). Since we want a general api that connects to both websites and a CLI tool such as fpm, I think we should the data option, not the form.

arteevraina commented 1 year ago

Of course we could change the content type in the header to multipart/form-data. But in my understanding, that is actually a frontend technique for websites that you can use for convenience. The general way to send data to an api is using the data(-d) parameter, not the form (https://www.outsystems.com/blog/posts/consuming-multipart-form-data-rest-method/#multipart-form-data). Since we want a general api that connects to both websites and a CLI tool such as fpm, I think we should the data option, not the form.

We could use the get_json() parameter on the backend as well and then you could send the data with -d parameter but that was the case when only key value pairs are there. In our case, we have a file (along with key value pairs), which basically is making us to apply some constraints. https://flask.palletsprojects.com/en/2.2.x/patterns/fileuploads/

We are trying to access the file i.e tarball using request.files and for that to work frontend website/cli should attach the data as multipart-form.

I read the blog shared above, it also mentions "The HTTP content type for this format is “application/x-www-form-urlencoded.” This works fine for simple input fields, but web pages may also allow users to upload one or more files. To support this, another format was created in the early days of the web, and it is known by its content type, which is “multipart/form-data.”

minhqdao commented 1 year ago

I see, if multipart/form-data allows the simultaneous upload of files and data, then we'll stick to it. Thanks for the clarification. 🙏🏽

arteevraina commented 1 year ago

Okay, But now I see you can use something like this to upload as well.

curl -d @path/to/data.json https://reqbin.com/echo/post/json Looks like there is another way. I would suggest you can use multipart/form data for now. And I try sending a curl request to the API and see if I can change that. But, if it works then I think it will be a minimal change on backend and maybe on fpm as well.

minhqdao commented 1 year ago

Yes, I also think it's fine to stick to multiform/form-data for now (or forever).

arteevraina commented 1 year ago

Yes, I also think it's fine to stick to multiform/form-data for now (or forever).

Looks good. Then we can upload files from the frontend website as well.

Edit : Found that we could not then send the other metadata with file if we just use -d flag. So, we will be forced to use multipartform content-type

arteevraina commented 1 year ago

In the last meeting, I discussed with @perazz , about the current issue with having a namespace token for uploading packages.

Currently, namespace maintainers can create tokens that are used for uploading the packages to the namespace. But, we also have a package maintainer role. Package maintainers are only responsible for maintaining packages (uploading new versions, adding other maintainers to the package).

But, currently, in order to get the token you have to be a namespace maintainer. Suppose, I add a person as a package maintainer thinking now they can be able to upload new versions of that package and now they want token from me because I am a namespace maintainer and hence they can't work independently and have to be dependent on the namespace maintainer for upload tokens which basically questions the role of package maintainer in itself.

So, I have a solution to allow package maintainers to generate tokens for the package itself and the token can be then used to upload a new version of that particular package.

For the fpm, there will be no change as such required, we will be checking whether the token received is a namespace upload token or package specific token on the backend and send the correct response to the user.

@minhqdao @henilp105 @perazz What are your thoughts about this ?

minhqdao commented 1 year ago

In the last meeting, I discussed with @perazz , about the current issue with having a namespace token for uploading packages.

Currently, namespace maintainers can create tokens that are used for uploading the packages to the namespace. But, we also have a package maintainer role. Package maintainers are only responsible for maintaining packages (uploading new versions, adding other maintainers to the package).

But, currently, in order to get the token you have to be a namespace maintainer. Suppose, I add a person as a package maintainer thinking now they can be able to upload new versions of that package and now they want token from me because I am a namespace maintainer and hence they can't work independently and have to be dependent on the namespace maintainer for upload tokens which basically questions the role of package maintainer in itself.

So, I have a solution to allow package maintainers to generate tokens for the package itself and the token can be then used to upload a new version of that particular package.

For the fpm, there will be no change as such required, we will be checking whether the token received is a namespace upload token or package specific token on the backend and send the correct response to the user.

@minhqdao @henilp105 @perazz What are your thoughts about this ?

Thanks, @arteevraina. I think that's perfect. A package maintainer should be able to generate tokens for that particular namespace + package, but not for other packages of that namespace.