Scifabric / pybossa

PYBOSSA is the ultimate crowdsourcing framework (aka microtasking) to analyze or enrich data that can't be processed by machines alone.
http://pybossa.com
GNU Affero General Public License v3.0

Create get new task function and expose in API #25

Closed by rufuspollock 12 years ago

rufuspollock commented 12 years ago

Location: /{api}/app/{id}/newtask/[{user-id}]

user-id is optional (may not be logged in!)
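
As a rough illustration of the URL shape proposed here, the following is a minimal sketch using only the standard library. The `/api` prefix stands in for the `{api}` placeholder, and `NEWTASK_URL` and `parse_newtask_url` are hypothetical names for this example, not part of PyBossa.

```python
import re

# Hypothetical pattern: "/api" stands in for the {api} prefix in the proposal.
# The trailing user-id segment is optional.
NEWTASK_URL = re.compile(
    r"^/api/app/(?P<app_id>\d+)/newtask(?:/(?P<user_id>\d+))?$"
)


def parse_newtask_url(path):
    """Return (app_id, user_id) for a matching path, or None otherwise.

    user_id is None when the optional segment is absent (anonymous user).
    """
    match = NEWTASK_URL.match(path)
    if match is None:
        return None
    user_id = match.group("user_id")
    return int(match.group("app_id")), (int(user_id) if user_id else None)
```

With this pattern, `/api/app/1/newtask/7` and `/api/app/1/newtask` both match, and only the first carries a user id.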

lucasmation commented 12 years ago

I have some ideas on how tasks should be attributed, which probably should be incorporated into PyBossa's structure. Basically there are two ways people can collaborate:

Sequence j=1: JOB i=1,j=1,t=1 > JOB i=1,j=1,t=2 > JOB i=1,j=1,t=3 > ... > JOB i=1,j=1,t=n1 (when a user declares the transcription to be over).

Sequence j=2: JOB i=1,j=2,t=1 > JOB i=1,j=2,t=2 > JOB i=1,j=2,t=3 > ... > JOB i=1,j=2,t=n2

We would then have to compare (automatically, or by experienced users or project supervisors) the final iterations of each sequence: JOB i=1,j=1,t=n1 and JOB i=1,j=2,t=n2
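
The indexing described here (item i, parallel sequence j, iteration t) could be sketched roughly as follows. This is only an illustration under assumptions: jobs are plain dicts with keys `i`, `j` and `t`, and the function names are hypothetical, not an existing PyBossa structure.

```python
def final_iterations(jobs):
    """Map each (i, j) sequence to its job with the highest iteration t."""
    latest = {}
    for job in jobs:
        key = (job["i"], job["j"])
        if key not in latest or job["t"] > latest[key]["t"]:
            latest[key] = job
    return latest


def candidates_for_comparison(jobs, i):
    """Return the final job of every parallel sequence j for item i, ordered by j."""
    latest = final_iterations(jobs)
    return [latest[key] for key in sorted(latest) if key[0] == i]


# Example: item 1 has two parallel sequences, with 3 and 2 iterations.
jobs = [
    {"i": 1, "j": 1, "t": 1}, {"i": 1, "j": 1, "t": 2}, {"i": 1, "j": 1, "t": 3},
    {"i": 1, "j": 2, "t": 1}, {"i": 1, "j": 2, "t": 2},
]
finals = candidates_for_comparison(jobs, 1)
```

Here `finals` holds the jobs with (j=1, t=3) and (j=2, t=2), which are the two results a supervisor or an automatic check would compare.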

@teleyinex, @rgrp: is this flexible sequential and/or parallel way of attributing tasks already built into PyBossa?

nigini commented 12 years ago

@teleyinex @rgrp This is maybe a Flask noob question, but I've searched the web and could not solve it yet! Can you try to help me with this?

I'm trying to create a new route in the API, but I keep receiving an 'HTTP 404'! As the route is different from the default ones (e.g. /app/1/newtask/1), I started by creating a new 'MethodView' with a GET method, then registered it with a different function than 'register_api'. Is this the idea? The code is below (consider it an early^2 version).

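# Imports this snippet needs; jsonpify, model and blueprint come from the
# surrounding PyBossa API module.
import json
import random

from flask import abort
from flask.views import MethodView

class AppTaskAPI(MethodView):

    @jsonpify
    def get(self, app_id, user_id=None):
        # The URL rules below only pass app_id and (optionally) user_id.
        # .first() returns None when no row matches; .one() would raise instead.
        bossa_app = model.Session.query(model.App).filter(model.App.id == app_id).first()
        if bossa_app and not bossa_app.hidden and bossa_app.tasks:
            # ToDo: This is a temporary solution!
            # random.choice avoids the out-of-range index that
            # random.randint(0, len(tasks)) can produce.
            task = random.choice(bossa_app.tasks)
            return json.dumps(task)
        else:
            abort(404)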

def register_apptask_api(view, endpoint, url, app='<int:app_id>', cmd='newtask', user='<int:user_id>'):
    view_func = view.as_view(endpoint)
    blueprint.add_url_rule('%s/%s/%s/%s' % (url, app, cmd, user),
        view_func=view_func,
        methods=['GET']
        )
    blueprint.add_url_rule('%s/%s/%s' % (url, app, cmd),
        view_func=view_func,
        methods=['GET']
        )

register_apptask_api(AppTaskAPI, 'api_apptask', '/app')
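
A general note on picking a random element by index: `random.randint(a, b)` includes both endpoints, so `randint(0, len(tasks))` can return an index one past the end of the list. A minimal sketch of a safer helper follows; `pick_random_task` is a hypothetical name, not a PyBossa function.

```python
import random


def pick_random_task(tasks, rng=random):
    """Return a uniformly random element of tasks, or None if it is empty."""
    if not tasks:
        return None
    # random.choice handles the index arithmetic, so no off-by-one is possible.
    return rng.choice(tasks)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible in unit tests.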

nigini commented 12 years ago

I'm running some tests with the Flickr demo... It appears that the "newtask" algorithm works on the premise that TaskRuns already exist, which is not true. I'll examine this now!

nigini commented 12 years ago

Yep! The "join(TaskRun)" at line 307 of new_task in model.py will crash if there is no TaskRun in the system yet! However, I could not capture this with a unit test (I'm still learning how to use them!). @rgrp Can this line be commented out until I finish a new algorithm for TaskRun creation? For the Flickr App it is not a problem, but it may be used at other sites!

rufuspollock commented 12 years ago

@nigini no, I would not comment this out. I'm also surprised it crashes if there are no TaskRuns (it should just not return anything). The real fix is probably a left join.
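
To illustrate the difference between the two joins, here is a minimal sketch in plain SQL via `sqlite3` rather than the SQLAlchemy query in model.py. The `task` and `task_run` tables and their columns are hypothetical stand-ins, not PyBossa's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (id INTEGER PRIMARY KEY, app_id INTEGER);
    CREATE TABLE task_run (id INTEGER PRIMARY KEY, task_id INTEGER);
    INSERT INTO task (id, app_id) VALUES (1, 1), (2, 1);
""")


def run_counts(conn, join):
    """Return (task id, number of runs) per task, fewest runs first.

    join must be "JOIN" or "LEFT JOIN"; it is interpolated into the SQL.
    """
    if join not in ("JOIN", "LEFT JOIN"):
        raise ValueError("unsupported join: %r" % (join,))
    query = (
        "SELECT task.id, COUNT(task_run.id) AS runs "
        "FROM task %s task_run ON task_run.task_id = task.id "
        "WHERE task.app_id = ? GROUP BY task.id ORDER BY runs, task.id" % join
    )
    return conn.execute(query, (1,)).fetchall()


# No task runs exist yet.
inner_rows = run_counts(conn, "JOIN")       # []
left_rows = run_counts(conn, "LEFT JOIN")   # [(1, 0), (2, 0)]
```

With no rows in `task_run`, the inner join returns nothing, while the left join still lists every task with a run count of zero, so a new task can be handed out.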

Aside: I'd also suggest not writing new core code until you are happy with unit tests, as we should wherever possible only add new code with tests ;-)

BTW: this issue is closed and can I assume you are not working on it?

nigini commented 12 years ago

OK! Issue closed. Not working on this, but yes on the algorithm to choose tasks.

BTW: I'm actually able to use unit tests, but have never used them in a production environment in Python. I've already got the idea of how they are used in PyBossa.

rufuspollock commented 12 years ago

But this task included the algorithm to choose tasks :-)

Would you like to post an update on what you want from the algorithm to choose tasks, so we can see if more work is needed on this issue?

At the moment the algorithm does crude randomization ...