NAMD / pypln.api

Python library to access PyPLN's API.
GNU General Public License v3.0

Fetching all the properties at once #40

Open fccoelho opened 9 years ago

fccoelho commented 9 years ago

Currently, when we want to access the properties of a PyPLN document, we need to fetch each one in a separate request. This is impractical.

There should be a different URL that returns all of the properties data at once.

For example, today when we point the browser to a document's properties URL, such as http://fgv.pypln.org/documents/57894/properties/, we get back a JSON object with an array of property URLs for that document:

{
    "properties": [
        "http://fgv.pypln.org/documents/57894/properties/average_sentence_length/",
        "http://fgv.pypln.org/documents/57894/properties/average_sentence_repertoire/",
        "http://fgv.pypln.org/documents/57894/properties/contents/",
        "http://fgv.pypln.org/documents/57894/properties/file_id/",
        "http://fgv.pypln.org/documents/57894/properties/file_metadata/",
        "http://fgv.pypln.org/documents/57894/properties/filename/",
        "http://fgv.pypln.org/documents/57894/properties/forced_decoding/",
        "http://fgv.pypln.org/documents/57894/properties/freqdist/",
        "http://fgv.pypln.org/documents/57894/properties/language/",
        "http://fgv.pypln.org/documents/57894/properties/lemmas/",
        "http://fgv.pypln.org/documents/57894/properties/length/",
        "http://fgv.pypln.org/documents/57894/properties/md5/",
        "http://fgv.pypln.org/documents/57894/properties/mimetype/",
        "http://fgv.pypln.org/documents/57894/properties/momentum_1/",
        "http://fgv.pypln.org/documents/57894/properties/momentum_2/",
        "http://fgv.pypln.org/documents/57894/properties/momentum_3/",
        "http://fgv.pypln.org/documents/57894/properties/momentum_4/",
        "http://fgv.pypln.org/documents/57894/properties/noun_phrases/",
        "http://fgv.pypln.org/documents/57894/properties/palavras_raw/",
        "http://fgv.pypln.org/documents/57894/properties/palavras_raw_ran/",
        "http://fgv.pypln.org/documents/57894/properties/pos/",
        "http://fgv.pypln.org/documents/57894/properties/repertoire/",
        "http://fgv.pypln.org/documents/57894/properties/semantic_tags/",
        "http://fgv.pypln.org/documents/57894/properties/sentences/",
        "http://fgv.pypln.org/documents/57894/properties/tagset/",
        "http://fgv.pypln.org/documents/57894/properties/text/",
        "http://fgv.pypln.org/documents/57894/properties/tokens/",
        "http://fgv.pypln.org/documents/57894/properties/upload_date/",
        "http://fgv.pypln.org/documents/57894/properties/wordcloud/"
    ]
}
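To make the cost concrete, here is a minimal sketch of what a client has to do today, assuming only the listing format shown above. The function names (`get_json`, `fetch_properties_one_by_one`) are hypothetical and not part of pypln.api, and the shape of each individual property response is not assumed: whatever JSON comes back is stored as-is.

```python
import json
from urllib.request import urlopen


def get_json(url):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_properties_one_by_one(properties_url, fetch=get_json):
    """Return ({property_name: data}, number_of_requests_made).

    `fetch` is injectable so the function can be tried without network access.
    """
    listing = fetch(properties_url)  # first request: the list of property URLs
    requests_made = 1
    data = {}
    for url in listing["properties"]:
        # The property name is the last path segment of its URL.
        name = url.rstrip("/").rsplit("/", 1)[-1]
        data[name] = fetch(url)  # one additional request per property
        requests_made += 1
    return data, requests_made
```

For the document above, which lists 29 properties, this means 30 HTTP requests to read a single document.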

I propose we add a new API endpoint which could have the form of either:

http://fgv.pypln.org/documents/57894/properties_data/

or

http://fgv.pypln.org/documents/57894/properties/gzip

which would return a gzipped JSON with all the data.
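On the client side, reading such a response could be as simple as the sketch below. This is only an illustration: the endpoint does not exist yet, the helper name `decode_gzipped_json` is hypothetical, and I am assuming the payload would be a single JSON object mapping property names to their values. The sample values are made up so that the snippet runs without a server.

```python
import gzip
import json


def decode_gzipped_json(raw_bytes):
    """Decompress a gzip payload and parse it as UTF-8 encoded JSON."""
    return json.loads(gzip.decompress(raw_bytes).decode("utf-8"))


# Simulate what the proposed endpoint might send (made-up values).
sample = {"language": "pt", "length": 1234}
payload = gzip.compress(json.dumps(sample).encode("utf-8"))

# One response body is enough to recover every property.
properties = decode_gzipped_json(payload)
```

If the server used standard HTTP `Content-Encoding: gzip` instead of returning a gzip file as the body, many HTTP clients would decompress it transparently and the explicit `gzip.decompress` step would not be needed.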

flavioamieiro commented 9 years ago

I agree this is important, especially for our use of PyPLN in mediacloud, but not only for that. I think it will be very useful for everyone.

I would propose /documents/<id>/properties/all or /documents/<id>/properties/all_data/ as the endpoint, just so it would still be under /documents/<id>/properties/.

flavioamieiro commented 9 years ago

I just realized this is a pypln.web issue, since it will involve changes to the REST API itself. Because those changes must then be followed by pypln.api, I'll leave this issue open (but if it is OK with you, @fccoelho, I'll edit it to reflect this).

I copied the content of the original issue from here to the new one at NAMD/pypln.web#134.