HTTP API for the challenge server

morhekil commented 9 years ago

Based on the discussions started in #2 and #3, let's also have some discussion about the API part of the challenge server. Given some positive feedback that was received on the idea of leveraging some of exercism's tools, we should probably stick with the API to begin with, too. It will allow us to quickly validate the proof of concept of the fetching part even without building our own client at all, and focus on the submission part of CLI, and evaluators / public features of the website.

I couldn't find any formal protocol definition, but all of their view code is collected here, and we'll just need to expose our own challenge data as "problems", with the overall challenge stream becoming a "track" (maybe have more side-tracks later for non-evaluated advanced challenges, or warmup exercises?)

Also, as I mentioned in #2 , we could build on the idea of using submodules to incorporate various challenge setups, but instead of mapping them on to "tracks" - link to individual challenges as submodules. In this case, all that challenge authors need to do is to set up the challenge as a git repo, and let the maintainers include it as a submodule - I think this is easy enough and convenient for everyone.

Thoughts?

morhekil commented 9 years ago

As @theonejb proposed in #2 :

Just a very basic thing that accepts auth as a header "Auth-ApiKey" and has the three endpoints which we talked about:

/v1/challenge/current/ /v1/challenge/<CHALLENGE ID>/ /v1/challenge/<CHALLENGE ID>/submit/ And just save the submissions to disk? Possibly have an endpoint that can list those submissions as well with download links?

gedex commented 9 years ago

@morhekil My suggestion is to create todo-list, something like:

[ ] List of challenges
```
GET /v1/challenges/
```
Endpoint documentation
[ ] Get challenge
```
GET /v1/challenges/:id
```
Endpoint documentation

This allows reviewer to check the list against opened PR whether implementation is satisfied or not. The Endpoint documentation link can be pointed to wiki on this repo (we can mimic GitHub API documentation).

I think the current issue is opened for the HTTP API discussion. However, once spec has been defined we can close this issue and open a dedicated issue with todo-list format. What do you think?

/cc @IndianGuru @sausheong

gedex commented 9 years ago

I'm expanding from @theonejb proposal and copying GitHub API doc.

List challenges
Get a single challenge
Create a submission
Edit a submission
Delete a submission

List challenges

List all challenges.

GET /v1/challenges

Parameters

Name	Type	Description
status	string	Indicates the status of the challenges to return. Can be either: `open`, `closed`, or `all`. Default to `open`.

Response

[
    {
        "id": 1,
        "url": "https://example.com/challenges/1",
        "status": "open",
        ...
    },
    ...
]

Get a single challenge

GET /v1/challenges/:challenge_id

Response

{
    "id": 1,
    "url": "https://example.com/challenges/1",
    "status": "open",
    ...
}

Create a submission

POST /v1/challenges/:challenge_id/submissions

Input

Name	Type	Description
files	object	Required. Files for submission.

The keys in files object are the string filename, and the value is another object with a key of content, and a value of the file contents. For example:

{
    "files": {
        "drum.go": {
            "content": "",
        }
    }
}

Response

{
    "id": 3,
    "url": "https://example.com/challenges/1/submissions/3/",
    "files": {
        "drum.go": {
            "content": "",
        }
    },
    "user": {
        ...
    },
    "created": "2010-04-14T02:15:15Z",
    "modified": "2010-04-14T02:15:15Z"
}

Edit a submission

PATCH /v1/challenges/:challenge_id/submissions/:submission_id

Delete a submission

DELETE /v1/challenges/:challenge_id/submissions/:submission_id

Response

Status: 204 No Content

morhekil commented 9 years ago

@gedex good job, we'll need a spec to let both CLI and API be on the same page. Maybe let's flll out the rest of the methods, and make a wiki page out of it?

I just had a bit of time today to write some code for this, so I pushed first bits and pieces for the API to my fork of this repo. Just the skeleton for the moment - I stubbed out the actual challenge storage with a map, and the challenge structure itself is just a quick slime with ID and Name, that can be queried and sent back as JSON.

I hope to have some time tomorrow, too, to flesh out at least the challenge part of the API, and write some tests against this protocol proposal that we can build on going forward.

@theonejb are you happy with this format for API spec, to make CLI tool speak the same language, too?

gedex commented 9 years ago

@gedex good job, we'll need a spec to let both CLI and API be on the same page. Maybe let's flll out the rest of the methods, and make a wiki page out of it?

Ok. Moved to https://github.com/GoChallenge/gochallenge/wiki/API-v1.

morhekil commented 9 years ago

awesome!

Just a few thoughts after reading it in full:

We should probably document GET /challenge/current call, too - as a shortcut for CLI to get the current challenge data.
Do we really need edit/delete submission calls? I reckon "update" would be just another submission, as the latest one will always be considered "current". Not sure why would anyone want to delete a submission, too - @IndianGuru, did you have any requests to do it during the first challenge?

gedex commented 9 years ago

We should probably document GET /challenge/current call, too - as a shortcut for CLI to get the current challenge data.

Is the expected response a single latest open challenge?

Do we really need edit/delete submission calls? I reckon "update" would be just another submission, as the latest one will always be considered "current". Not sure why would anyone want to delete a submission, too - @IndianGuru, did you have any requests to do it during the first challenge?

The edit call allows participant to send subsequent iteration (which should be blocked once the submission is closed). I saw some participants on the channel asking whether it's possible to send updated code. The delete is provided because after all the submission is belong to participant. However, it's back to the organiser whether both endpoints are needed or not.

morhekil commented 9 years ago

Is the expected response a single latest open challenge?

Correct, or an error if there isn't one.

The edit call allows participant to send subsequent iteration

I believe that PATCH method might not the best one to express this action then. After all, every submission is a separate artefact - say, a zip file and its associated metadata. I'm happy to stick with POST method to create submissions only, and when a participant wants to overwrite an earlier submission - they just create a new one, and it takes precedence. So I'm not saying that we don't need updates at all (I strongly believe that we do), but in this context they're technically just a sequence of creates (POSTS), rather than PATCH-like updates.

Does it make sense?

theonejb commented 9 years ago

@morhekil Yeah, the API endpoints look sane and are good enough as a reference for the CLI tool. @gedex Amazing job. Thank you! Having this in a proper doc should make life easier for all involved. :+1:

I tend to agree with @morhekil that a DELETE endpoint isn't required, at least at the moment. Just submitting another set of files should overwrite the previous submission. And if we do require a delete, having it in the CLI might be overkill. We should probably have it on the web app side, so a user can ask to delete his submitted code. More explicit that way, and less chance of wrong deletions.

@gedex For the /v1/challenges/current call, my original intention was for it to return just an ID, that the CLI would then use to get the challenge info via the /v1/challenges/:challenge_id.

morhekil commented 9 years ago

@gedex For the /v1/challenges/current call, my original intention was for it to return just an ID, that the CLI would then use to get the challenge info via the /v1/challenges/:challenge_id.

isn't it easier to accept full challenge data in response? Saves a roundtrip to the server, too :)

idawes commented 9 years ago

Agreed, the current call should return the current challenge data instead of an ID.

gedex commented 9 years ago

Ok.

/v1/challenges/current is added and returns challenge object.
Update and delete endpoints were strike out and marked as TBD.

idawes commented 9 years ago

@gedex: Awesome.

Are we missing documentation of the auth param?

gedex commented 9 years ago

Are we missing documentation of the auth param?

What the expected method for Authentication? Just token key in the header?

theonejb commented 9 years ago

Makes sense that way too. Plus @morhekil You're right, saves a roundtrip.

theonejb commented 9 years ago

@gedex For the auth param, I was thinking just a header called Auth-ApiKey that is sent with every request from the CLI tool.

gedex commented 9 years ago

Ok. Page is updated with authentication information.

theonejb commented 9 years ago

@gedex Awesome. I just looked at the wiki page. Looks great. Thanks!

I'll get started on the tool this weekend. Hopefully, shouldn't be more than a few hours of work. I expect to have something we can start testing by Saturday evening (GMT +4).

morhekil commented 9 years ago

that's looking good, I believe that challenge-related calls and submission creation should cover the basic functionality.

I gave some thought to the delivery of challenge data, and I think that we should avoid serving it as directly in HTTP response - if the challenge takes off, and if there's a non-trivial data component to one of the challenges, it might put some extra strain on the server (in terms of traffic). I suggest to provide CLI with a URL to the file location instead, and the backend will take care of making the challenge data available at the specified location.

The benefits of this approach that I can see:

We can push the challenge data out to a CDN, let it worry about the traffic requirements and run the API server on something lightweight.
Also, CDN will give a performance boost to the actual downloads, as it will serve the data with a point that is closest to participant's location.
This approach also plays nice with a browser-based workflow, that has been suggested on Slack - e.g. if someone hits GET challenge API with a browser (which we can detect by text/html content type of the request), we can just respond with a redirect to file's real location, and let that endpoint worry about caching etc.

Of course, we can start with simple self-served file delivery, but I'd like to encode the location based workflow in the API, to let CLI tool handle it transparently once it's out in users' hands.

Thoughts?

idawes commented 9 years ago

Sounds like good future proofing and shouldn't complicate things too much. Of course it brings back the server round trip that we had previously argued against, but this seems like it would bring significant benefit in the long run.

morhekil commented 9 years ago

it was proposed to leverage existing go tools, and make challenge code available in a go get-able form. The documentation suggests that we can even alias repository for go get by redirecting the tool via a page with a special meta-tag in it, which sounds like a nice way to link in and namespace all challenge repositories under the same directory name in GOPATH.

I added import key to the description of challenge data, so that CLI tool should basically to go get using that URL to get the challenge code.

IndianGuru commented 9 years ago

@morhekil - we have had many requests for re-submission of code for the first challenge. We have been accepting this. However, it has been decided that from the next challenge re-submissions will not be accepted.

Also, one participant who had earlier submitted his/her solution requested us a week later to delete his/her submission and withdrew from the challenge.

IndianGuru commented 9 years ago

@gedex I had some queries related to:

List challenges - will this be accessible to the public? What if a challenge author has already entered his challenge but should be made public at a later date?
Edit Challenge submission - Again, if a participant has already submitted his challenge, he/she should not be allowed to edit it nor delete it. Only the admins would have access to do the same.

Am I thinking in the right direction?

IndianGuru commented 9 years ago

It just struck me that there are many types of submissions:

Normal
Just for Fun
Bonus
Anonymous

For the first three we require the participant's name, email id and one profile (GitHub, Twitter etc.) along with the zip file which is named differently by each participant.

For the last one we definitely require his/her email id and the zip file.

Also, none of the solution evaluators nor the challenge setter should know the name of the participant. The zip file has to be sanitised (remove any references to the name of the participant in the say README file; remove .git repos etc.) and then given to them.

morhekil commented 9 years ago

However, it has been decided that from the next challenge re-submissions will not be accepted.

Is this because of the manual factor involved in handling this, or does it throw a wrench into the evaluation process (e.g. the submissions are evaluated as soon as they're received, and not at the end of the challenge)? I'm just thinking that if it is the former, than automating the process should help to resolve the labor part of it.

morhekil commented 9 years ago

@IndianGuru

List challenges - will this be accessible to the public? What if a challenge author has already entered his challenge but should be made public at a later date?

It is accessible, but paired with some metadata (e.g. challenge timeframe, etc), and future challenges are not exposed until their launch date. I don't see a reason to block access to past challenges - people should still be able to download them, but the submissions will be blocked, of course - do you agree?

Edit Challenge submission - Again, if a participant has already submitted his challenge, he/she should not be allowed to edit it nor delete it. Only the admins would have access to do the same.

That was the result of our conversation above, too. Submissions are immutable, and can only be overwritten by participant's later submissions (if we decide to allow that after all).

morhekil commented 9 years ago

@IndianGuru

For the first three we require the participant's name, email id and one profile (GitHub, Twitter etc.) along with the zip file which is named differently by each participant.

This should be all configured by CLI tool and stored locally, and we can receive them with the subission later and update our data.

For the last one we definitely require his/her email id and the zip file.

zip file is also generated by CLI tool automatically, just as a step in submission workflow - participants shouldn't need to zip things up manually anymore.

Also, none of the solution evaluators nor the challenge setter should know the name of the participant. The zip file has to be sanitised (remove any references to the name of the participant in the say README file; remove .git repos etc.) and then given to them.

I'm not sure if/how we can guarantee this without a human looking at the submission. I guess we could display a warning on submission, advising the participant that the source code should not contain any references to his name. We can also run some basic sanity checks on submission, just grepping the code for the presence of profile data - name, email, etc.

sharang-d commented 9 years ago

@IndianGuru Regarding list challenges i think we are coming to user roles. Like the most basic admin and user. The user will see only past and ongoing challenges while the admin can view unreleased too.

Anyway here's a sample challenge object:

{
    "id": 1,
    "url": "https://golang-challenge.com/challenges/1",
    "status": "open",
    "import": "https://git.golang-challenge.com/challenge1",
    "author": {
        "name": "Sharang Dashputre",
        "twitter": "@_SharangD",
        "github": "https://github.com/sharang-d",
        "email": "sharang.d@gmail.com"
    },
    "start_date": "Timestamp(with timezone)",
    "end_date": "Timestamp(with timezone)"
}

"author" needs to have at least one key and others can be optional. Let me know what you think.

morhekil commented 9 years ago

@sharang-d I think "released" and "status" is the same thing, or am I reading it wrong?

sharang-d commented 9 years ago

Oh! My bad. I think we need a new status in that case. Something on the lines of "unreleased" ? Can't think of a name. Editing the post to remove "released".

morhekil commented 9 years ago

well, thinking about it again - unauthorised users will never see this record at all really, as it will be filtered on the server and never sent to them. So I have some doubts now that we really need that all, or do we?

sreejithr commented 9 years ago

In Create Submission (POST /v1/challenges/:challenge_id/submissions) endpoint (doc), why are we sending the source code in plaintext in the request?

Why not tarball the file and send it in a multipart/form-data POST? Is it a security thing?

sharang-d commented 9 years ago

@morhekil Authorised users can be either admin or user. We will need the unreleased status to reply with when an admin hits the API.

This is assuming that the API will be available to both admins and users.

morhekil commented 9 years ago

@sharang-d yep, makes sense, I guess I was thinking mostly from participant's perspective. I put some notes on the wiki, documenting possible statuses and the filtered access behaviour, feel free to make more changes if I missed anything else!

@sreejithr submission endpoint is at the first-idea stage, so no - it's not a security thing, we just haven't got up to that stage yet :) Maybe open another issue with your ideas, and let's start the discussion there - to separate these two threads?

IndianGuru commented 9 years ago

@morhekil for the first challenge, we have 6 evaluators and we have divided them into 2 teams of 3 each. As of now we have 133 submissions (we will get some more before we end the challenge in 16 hrs from now). Each team evaluates 50% of the submissions.

Techinically an evaluator gets 20 days to evaluate all the submissions and we have been requesting the evaluators to start evaluating a submission as soon as we receive the same. In reality the evaluators can effectively spend time only during the weekends.

Resubmissions is a problem.

mathcunha commented 9 years ago

@morhekil, once the cli will submit a tarball, I think that @sreejithr made a good point. And we could update the json based on the comments at issue #2

GoChallenge / gochallenge