|PyPI version| |Build Status| |Codecov branch| |Coverage Status| |Code Health| |Dependency Status|
If you want to create an alternative twitter search portal, the only way would be to use the official twitter API to retrieve Tweets. But that interface needs an OAuth account and it makes your search portal completely dependent on Twitters goodwill. The alternative is, to scrape the tweets from the twitter html search result pages, but Twitter may still lock you out on your IP address. To circumvent this, you need many clients accessing twitter to scrape search results. This makes it neccessary to create a distributed peer-to-peer network of twitter scrapers which can all organize, store and index tweets. This solution was created with loklak.
What is Loklak ? ^^^^^^^^^^^^^^^^
It is a server application which is able to collect messages from various sources, including twitter. The server contains a search index and a peer-to-peer index sharing interface.
Why should I use this ? ^^^^^^^^^^^^^^^^^^^^^^^
If you like to be anonymous when searching things, want to archive tweets or messages about specific topics and if you are looking for a tool to create statistics about tweet topics, then you may consider loklak. With loklak you can do: - collect and store a very, very large amount of tweets and similar messages - create your own search engine for tweets - omit authentication enforcment for API requests on the twitter plattform - share tweets and tweet archives with other loklak users - search anonymously on your own search portal - create your own tweet search portal or statistical evaluations - use Kibana to analyze large amounts of tweets as source for statistical data.
To use the loklak app, first an object of the loklak type needs to be
created. Do the following to install the pip
module and add it to
your requirements
for the application.
Currently our pip package is supported for python2.7 and lower venv versions.
Will be supported in python3 version soon!
pip install python-loklak-api
To use the GUI client for API, install wxPython
from
here <http://www.wxpython.org/download.php>
and then install
Gooey <https://github.com/chriskiehl/Gooey>
pip
module.
pip install Gooey
Then add
::
from gooey import Gooey
@Gooey
before def main():
in /bin/loklak
Loklak once installed, can be used in the application as
from loklak import Loklak
To create a loklak object you can assign the Loklak()
object to a
variable. variable = Loklak()
eg. l = Loklak()
This creates an objects whose backend loklak server
is http://loklak.org/
| If you want to set this API to use your own server, you can now define
it by doing
| l = Loklak('http://192.168.192.5:9000/')
for example or pass a URL
to it as
| l = Loklak('http://loklak-super-cluster.mybluemix.net/')
Note the trailing /
is important and so is http://
API Documentation
Status of the Loklak server
'''''''''''''''''''''''''''
Using the object created above, ``l.status()`` returns a json of the
status as follows
.. code:: json
{
"system": {
"assigned_memory": 2051014656,
"used_memory": 1374976920,
"available_memory": 676037736,
"cores": 8,
"threads": 97,
"runtime": 734949,
"time_to_restart": 85665051,
"load_system_average": 18.19,
"load_system_cpu": 0.24344589731081373,
"load_process_cpu": 0.018707976134026073,
"server_threads": 68
},
"index": {
"mps": 176,
"messages": {
"size": 1195277012,
"size_local": 1195277012,
"size_backend": 0,
"stats": {
"name": "messages",
"object_cache": {
"update": 51188,
"hit": 1796,
"miss": 139470,
"size": 10001,
"maxsize": 10000
},
"exist_cache": {
"update": 68419,
"hit": 2450,
"miss": 137020,
"size": 68313,
"maxsize": 3000000
},
"index": {
"exist": 68634,
"get": 0,
"write": 51016
}
},
"queue": {
"size": 100000,
"maxSize": 100000,
"clients": 72
}
},
"users": {
"size": 65915082,
"size_local": 65915082,
"size_backend": 0,
"stats": {
"name": "users",
"object_cache": {
"update": 51827,
"hit": 3756,
"miss": 639,
"size": 10000,
"maxsize": 10000
},
"exist_cache": {
"update": 56222,
"hit": 0,
"miss": 0,
"size": 15933,
"maxsize": 3000000
},
"index": {
"exist": 0,
"get": 639,
"write": 51016
}
}
},
"queries": {
"size": 4251,
"stats": {
"name": "queries",
"object_cache": {
"update": 452,
"hit": 132,
"miss": 3297,
"size": 160,
"maxsize": 10000
},
"exist_cache": {
"update": 3703,
"hit": 162,
"miss": 2959,
"size": 3002,
"maxsize": 3000000
},
"index": {
"exist": 2959,
"get": 176,
"write": 292
}
}
},
"accounts": {"size": 96},
"user": {"size": 790137},
"followers": {"size": 146},
"following": {"size": 135}
},
"client_info": {
"RemoteHost": "103.43.112.99",
"IsLocalhost": "false",
"request_header": {
"Cookie": "__utma=156806566.949140694.1455798901.1455798901.1455798901.1; __utmc=156806566; __utmz=156806566.1455798901.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Upgrade-Insecure-Requests": "1",
"X-Forwarded-Proto": "http",
"Connection": "close",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36",
"X-Forwarded-For": "103.43.112.99",
"Host": "loklak.org",
"Accept-Encoding": "gzip, deflate, sdch",
"Accept-Language": "en-US,en;q=0.8",
"X-Real-IP": "103.43.112.99"
}
}
}
Settings of the loklak server (strictly only for localhost clients)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Using the class method ``settings()`` to returns a json of the settings
being used by the loklak server
Hello test - Check if the server is responding properly and is online
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Using the object created above ``l.hello()`` returns a json response of
the server status
When the server is online, the json should read
.. code:: json
{"status": "ok"}
Peers - API To find out the loklak peers
''''''''''''''''''''''''''''''''''''''''
Finding the list of loklak peers, use the object created above
``l.peers()`` which returns a json response containing all the peers
connected to ``loklak.org``
Users API
'''''''''
What this can do ?
- Fetch the details of one user
- Fetch the details of the user along with number of their followes and
following
- Fetch only the followers / following of a particular user
Query Structure:
``l.user(<username>, <followers count>, <following count>)``
| ``<username>`` is a string, e.g. ``'loklak_app'``
| ``<followers count>`` and ``<following count>`` is a numeric or a
string or ``None``
| e.g.
| 1. ``l.user('loklak_app')``
| 2. ``l.user('loklak_app', 1000)`` - 1000 followers of ``loklak_app``
| 3. ``l.user('loklak_app', 1000, 1000)`` - 1000 followers and following
of ``loklak_app``
| 4. ``l.user('loklak_app', None, 1000)`` - 1000 following of
``loklak_app``
Accounts API
''''''''''''
LOCALHOST ONLY, Loklak server running on port ``localhost:9000``
To query the user account details of the data within the loklak server,
use ``l.account('name')`` where ``'name'`` is the screen\_name of the
user whose information is required.
To update the user details within the server, package a ``json`` object
with the following parameters and other parameters which needs to be
pushed to the server and use the ``action=update`` where ``action`` is
the 2nd parameter of the ``account()`` api
``l.account('name', 'update', '{ json object }')``
Search API
''''''''''
Public search API for the scraped tweets from Twitter.
Query structure:
``search('querycontent', 'since date', 'until date', 'from a specific user', '# of tweets')``
e.g. ``l.search('doctor who')``
A search result in json looks as follows.
.. code:: json
{
"search_metadata" : {
"itemsPerPage" : "100",
"count" : "100",
"count_twitter_all" : 0,
"count_twitter_new" : 100,
"count_backend" : 0,
"count_cache" : 97969,
"hits" : 97969,
"period" : 18422,
"query" : "doctor who",
"client" : "103.43.112.99",
"time" : 4834,
"servicereduction" : "false",
"scraperInfo" : "http://kaskelix.de:9000,local"
},
"statuses" : [ {
"created_at" : "2015-03-03T19:30:43.000Z",
"screen_name" : "exanonym77s",
"text" : "check #DoctorWho forums #TheDayOfTheDoctor #TheMaster @0rb1t3r http://www.thedoctorwhoforum.com/ https://pic.twitter.com/FvW6J9WMCw",
"link" : "https://twitter.com/ronakpw/status/572841550834737152",
"id_str" : "572841550834737152",
"source_type" : "TWITTER",
"provider_type" : "SCRAPED",
"retweet_count" : 0,
"favourites_count" : 0,
"hosts" : [ "www.thedoctorwhoforum.com", "pic.twitter.com" ],
"hosts_count" : 2,
"links" : [ "http://www.thedoctorwhoforum.com/", "https://pic.twitter.com/FvW6J9WMCw" ],
"links_count" : 2,
"mentions" : [ "@0rb1t3r" ],
"mentions_count" : 1,
"hashtags" : [ "DoctorWho", "TheDayOfTheDoctor", "TheMaster" ],
"hashtags_count" : 3,
"without_l_len" : 62,
"without_lu_len" : 62,
"without_luh_len" : 21,
"user" : {
"name" : "Example User Anyone",
"screen_name" : "exanonym77s",
"profile_image_url_https" : "https://pbs.twimg.com/profile_images/567071565473267713/4hiyjKkF_bigger.jpeg",
"appearance_first" : "2015-03-03T19:31:30.269Z",
"appearance_latest" : "2015-03-03T19:31:30.269Z"
}
}, ...
]
}
Mentioning the Since and Until dates
e.g. ``l.search('sudheesh001', '2015-01-10', '2015-01-21')``
Which results in a json as follows
.. code:: json
{
"search_metadata" : {
"itemsPerPage" : "100",
"count" : "100",
"count_twitter_all" : 0,
"count_twitter_new" : 100,
"count_backend" : 0,
"count_cache" : 97969,
"hits" : 97969,
"period" : 18422,
"query" : "doctor who",
"client" : "103.43.112.99",
"time" : 4834,
"servicereduction" : "false",
"scraperInfo" : "http://kaskelix.de:9000,local"
},
"statuses" : [ {
"timestamp" : "2016-05-11T16:53:46.615Z",
"created_at" : "2016-05-11T16:52:59.000Z",
"screen_name" : "BelleRinger1",
"text" : "I would love to see http://www.cultbox.co.uk/?p=53662",
"link" : "https://twitter.com/BelleRinger1/status/730440578031190016",
"id_str" : "730440578031190016",
"source_type" : "TWITTER",
"provider_type" : "SCRAPED",
"retweet_count" : 0,
"favourites_count" : 0,
"images" : [ ],
"images_count" : 0,
"audio" : [ ],
"audio_count" : 0,
"videos" : [ ],
"videos_count" : 0,
"place_name" : "",
"place_id" : "",
"place_context" : "ABOUT",
"hosts" : [ "www.cultbox.co.uk" ],
"hosts_count" : 1,
"links" : [ "http://www.cultbox.co.uk/?p=53662" ],
"links_count" : 1,
"mentions" : [ ],
"mentions_count" : 0,
"hashtags" : [ ],
"hashtags_count" : 0,
"classifier_language" : "english",
"classifier_language_probability" : 6.95489E-8,
"without_l_len" : 19,
"without_lu_len" : 19,
"without_luh_len" : 19,
"user" : {
"screen_name" : "BelleRinger1",
"user_id" : "2497345790",
"name" : "Belle Gaudreau",
"profile_image_url_https" : "https://pbs.twimg.com/profile_images/723262970805907456/RbMnyEqs_bigger.jpg",
"appearance_first" : "2016-05-11T16:53:46.615Z",
"appearance_latest" : "2016-05-11T16:53:46.615Z"
}
}, ...
]
}
Valid parameters for ``since`` and ``until`` can also be ``None`` or any
``YMD`` date format. Looking towards the future releases to resolve this
to any date format.
The ``from a specific user`` parameter makes sure that the results
obtained for the given query are only from a specific user.
e.g. ``l.search('doctor who', '2015-01-10', '2015-01-21','0rb1t3r')``
The ``# of tweets`` parameter is how many tweets will be returned.
e.g. ``l.search('avengers', None, None, 'Iron_Man', 3)``
Aggregations API
''''''''''''''''
GeoLocation API
'''''''''''''''
Loklak allows you to fetch required information about a country or city.
e.g. ``l.geocode(['Barcelona'])``, ``l.geocode(['place1', 'place2'])``
.. |PyPI version| image:: https://badge.fury.io/py/python-loklak-api.svg
:target: https://badge.fury.io/py/python-loklak-api
.. |Build Status| image:: https://travis-ci.org/loklak/loklak_python_api.svg?branch=master
:target: https://travis-ci.org/loklak/loklak_python_api
.. |Codecov branch| image:: https://img.shields.io/codecov/c/github/loklak/loklak_python_api/master.svg?style=flat-square&label=Codecov+Coverage
:target: https://codecov.io/gh/loklak/loklak_python_api
.. |Coverage Status| image:: https://coveralls.io/repos/github/loklak/loklak_python_api/badge.svg?branch=master
:target: https://coveralls.io/github/loklak/loklak_python_api?branch=master
.. |Code Health| image:: https://landscape.io/github/loklak/loklak_python_api/master/landscape.svg?style=flat
:target: https://landscape.io/github/loklak/loklak_python_api/master
.. |Dependency Status| image:: https://gemnasium.com/badges/github.com/loklak/loklak_python_api.svg
:target: https://gemnasium.com/github.com/loklak/loklak_python_api