Open uxian opened 11 years ago
Er.. I'll sort out the logic part. When I made ranking default to "no", I did not fully check whether all components of SRFE still work. When things are right, running SRFE should not require pymmseg (or anything related to feature extraction).
For a quick fix, can you try using his setup.py build? Then the _mmseg module, which is compiled as a dynamic library, will be in the _build folder. On Linux, a "*.so" file will be created. I don't know about Windows. In my environment, I used his setup.py install --user to install the module...
Can you try the latest snsrouter (with the submodule "snsapi" updated)? I fixed the two config sections:
I tested it on a fresher env. Now the FE without ranking should be OK....
@uxian
Reporting a tiny bug: on Windows, the third method below does not work, but the second one (the one without the "#") works well.
In snsapi/utils.py, line 135:
def utc2str(u):
    #return str(datetime.datetime.fromtimestamp(u))
    return _format_date(datetime.datetime.utcfromtimestamp(u))
    #return _format_date(datetime.datetime.fromtimestamp(u, tz.tzlocal()))
The following lines are the traceback. You caught this exception before; I just commented out the "try/catch" to see why I got 0 new messages when I opened 127.0.0.1/home_timeline.
  File "snsapi\snsapi\utils.py", line 138, in utc2str
    return _format_date(datetime.datetime.fromtimestamp(u, tz.tzlocal()))
  File "C:\Python27\lib\site-packages\dateutil\tz.py", line 92, in utcoffset
    if self._isdst(dt):
  File "C:\Python27\lib\site-packages\dateutil\tz.py", line 135, in _isdst
    return time.localtime(timestamp+time.timezone).tm_isdst
ValueError: (22, 'Invalid argument')
Now, after modifying the function utc2str, I can see 16 messages from your RSS feed. lol
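For illustration, here is a minimal sketch of a timestamp formatter that tries local time first and falls back to UTC when the platform's local-time conversion raises, as it did in the traceback above. The name utc2str_safe and the format string are hypothetical; this is not the snsapi implementation, and it uses only the standard library instead of dateutil.

```python
import datetime

UTC = datetime.timezone.utc
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=UTC)


def utc_datetime(u):
    """Convert a Unix timestamp to an aware UTC datetime by pure arithmetic.

    This avoids the platform's localtime()/gmtime() calls entirely.
    """
    return EPOCH + datetime.timedelta(seconds=u)


def utc2str_safe(u, local=True, fmt="%Y-%m-%d %H:%M:%S %z"):
    """Format a Unix timestamp (hypothetical helper, not part of snsapi).

    When local is True, try to convert to the system's local time zone and
    keep UTC if the platform rejects the conversion.
    """
    dt = utc_datetime(u)
    if local:
        try:
            dt = dt.astimezone()  # system local time zone
        except (ValueError, OSError, OverflowError):
            pass  # keep the UTC value rather than dropping the message
    return dt.strftime(fmt)


print(utc2str_safe(0, local=False))
```

With a fallback like this, a timestamp that the local-time conversion rejects still produces a readable UTC string instead of an exception.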
I'm still confused about how to auth my SinaWeiboStatus. Write a script to fetch the token? Or use snscli to get an xxx.token and move it to the sns-router dir?
I forgot to document the configuration. You can do something like this (in channel.json):
{
  "platform": "SinaWeiboStatus",
  "methods": "home_timeline,update,forward",
  "user_id": "",
  "user_name": "snsapi_test",
  "channel_name": "sina_account_1",
  "auth_info": {
    "save_token_file": "(default)",
    "callback_url": "http://127.0.0.1:8080/auth/second/",
    "cmd_fetch_code": "(local_webserver)",
    "cmd_request_url": "(default)"
  },
  "app_secret": "",
  "open": "yes",
  "app_key": "",
  "home_timeline": {
    "count": 100
  }
},
What matters is the "callback_url" part. SRFE will intercept the request_url and fetch_code methods of SNSBase. The above callback_url is the endpoint that hands the authorized code to SRFE. You may want to change the IP and port according to your srfe.conf.
With this configured, you can complete the auth flow from the "config" page.
Besides, you can acquire those ".save" files from snscli and put them under SRFE.
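The callback_url setting described above can be sketched as a small helper that rewrites a channel entry to match the host and port SRFE listens on. The function name with_callback_url is hypothetical, and the host and port are passed in as arguments because the key names inside srfe.conf are not shown in this thread.

```python
import json


def with_callback_url(channel, host="127.0.0.1", port=8080, path="/auth/second/"):
    """Return a copy of a channel config whose callback_url points at SRFE.

    Hypothetical helper: host and port should match whatever SRFE is
    configured to listen on.
    """
    updated = dict(channel)
    auth_info = dict(updated.get("auth_info", {}))
    # Leave the port out of the URL when it is the HTTP default.
    netloc = host if port == 80 else "%s:%d" % (host, port)
    auth_info["callback_url"] = "http://%s%s" % (netloc, path)
    updated["auth_info"] = auth_info
    return updated


channel = {
    "platform": "SinaWeiboStatus",
    "channel_name": "sina_account_1",
    "auth_info": {"save_token_file": "(default)"},
}
print(json.dumps(with_callback_url(channel, port=8080), indent=2))
```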
Using callback_url is cooler, hehe.
I tested sns-router on Mac OS X 10.8.1, and it works perfectly! I will take some time to deal with the bugs in the Windows env.
If I use 'http://127.0.0.1:8080/auth/second/' as Sina Weibo's callback_url, how do I configure it on open.weibo.com? I tried 'http://127.0.0.1:8080/', but it was not accepted; 'http://127.0.0.1/' was accepted, but an error happened during auth.
I think http://127.0.0.1:8080/auth/second/ should work. That's what I put there. I think the reason it is not accepted is the missing "unauth" callback URL, for which I put http://vipc3.ie.cuhk.edu.hk:8080/unauth as a fake entry...
I added you as a collaborator on sns-router. Feel free to create a new (issue) branch containing fixes for your environment. When finished, just drop me a message to pull it into the "dev" branch. I can do some cross-checking and follow what's going on. I'll also brief you on changes in the same way. @uxian
@uxian, there may be other problems arising from channel configuration. I put my configuration on the following page:
https://github.com/hupili/sns-router/wiki/Channel-Configuration
Some notes and explanations will be added, e.g. Renren does not accept 127.* addresses, so I use the IE server as a bouncer.
/wiki/Channel-Configuration is sweet.
Sina Weibo no longer supports a callback_url like "http://127.0.0.1:8080/auth/second/"; no port is allowed, or at least I can't set one like that. I suggest you do not change your callback_url, or you may not be able to set it back.
I also tested sns-router on my Linux server. It works well, except that it's a little bit hard to set ranking to "yes". I will keep on trying.
Er? That's strange.. I configured a callback URL with a port included less than one month ago... I tried to search for an official announcement but found nothing. There are some questions on the Internet but no answers.
Briefly noting some ways to work around this:
1. Configure srfe.json to use a standard port like 80.
2. Use ssh -L80:127.0.0.1:8080 username@localhost to forward the port.
3. Use a bouncer URL such as https://snsapi.ie.cuhk.edu.hk/bouncer/11111/?{parameters returned}; it will be redirected to https://localhost:11111/auth/second/?{parameters returned}
Cool! I just found an idle server to run sns-router on port 80, which is your first way. The other two ways are so cool!
@uxian I did not change my current callback_url for testing (I'm afraid I won't get it back..). I'm just wondering whether Sina is filtering other patterns, like "127.0.0.1". So, would these equivalents of http://127.0.0.1:8080/auth/second/ work?
http://localhost:8080/auth/second/
http://127.0.0.111:8080/auth/second/
Anyway, I think a universal bouncer is needed. e.g. on Renren, the callback_url must be something reachable from the public Internet...
No, none of them works. More bad examples:
http://localhost/auth/second/
http://www.snsrouter.com:8000/auth/second/
http://www.snsrouter.com:8000/auth.php
what works:
http://127.0.0.1/au.php
http://127.0.0.1/auth/second/
I think their rules are:
universal bouncer is a great and useful idea!
I deployed a simple bouncer:
Test urls:
https://snsapi.ie.cuhk.edu.hk/aux/bouncer/redir/localhost/8080/?code=testcode
https://snsapi.ie.cuhk.edu.hk/aux/bouncer/redir/127.0.0.1/8080/?code=testcode
So one can configure one's callback URL to be:
https://snsapi.ie.cuhk.edu.hk/aux/bouncer/redir/127.0.0.1/8080/?
or
https://snsapi.ie.cuhk.edu.hk/aux/bouncer/redir/127.0.0.1/8080/
Whether the trailing "?" is needed depends on the OSN's convention.
The target address is restricted to localhost addresses, but the port is free to choose.
Code is in the snsapi-website repo:
https://github.com/hupili/snsapi-website/tree/master/aux/bouncer
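The deployed code lives in the repo linked above; the following is only an illustrative sketch of the redirect rule as described in this thread (localhost-only targets, free choice of port, query string passed through). The function name bounce_target, the http scheme, and the /auth/second/ target path are assumptions, not taken from the deployed bouncer.

```python
from urllib.parse import urlsplit

# Only loopback targets are allowed, as described in the thread.
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}


def bounce_target(bouncer_url, prefix="/aux/bouncer/redir/",
                  target_path="/auth/second/"):
    """Map a bouncer URL to the local URL it would redirect to.

    Hypothetical sketch: expects .../redir/<host>/<port>/?<query>.
    """
    parts = urlsplit(bouncer_url)
    if not parts.path.startswith(prefix):
        raise ValueError("not a bouncer URL: %r" % bouncer_url)
    segments = parts.path[len(prefix):].strip("/").split("/")
    if len(segments) != 2:
        raise ValueError("expected <host>/<port> after the prefix")
    host, port = segments
    if host not in ALLOWED_HOSTS:
        raise ValueError("target host must be a localhost address")
    if not (port.isascii() and port.isdigit() and 0 < int(port) < 65536):
        raise ValueError("invalid port: %r" % port)
    target = "http://%s:%d%s" % (host, int(port), target_path)
    if parts.query:
        # Pass the parameters returned by the OSN through unchanged.
        target += "?" + parts.query
    return target


print(bounce_target(
    "https://snsapi.ie.cuhk.edu.hk/aux/bouncer/redir/127.0.0.1/8080/?code=testcode"))
```

Restricting the host to a fixed allow-list is what keeps a public bouncer from turning into an open redirect.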
If you have other deployment experience, you can collect them here
https://github.com/hupili/sns-router/wiki/System-Deployment-Case-Study
@uxian
Just restructured the feature extraction part. Now features can be enabled from autoweight.json. autoweight.json.example should be able to run directly. Currently, only topic depends on pymmseg. We can experience other features with the whole flow now.
"operation" is added to the frontend with some brief explanations. You can just execute them sequentially.
Hope it gets through this time~
The way queue.py accesses the training logic is very kludgy. It's just an assembly of the code in analysis, in which I dumped more data than needed for offline analysis. Later I will cut the non-essential work and make the new functions clearer.
The latest code is on dev.
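As a rough sketch of the point above that only topic depends on pymmseg, features could be filtered by whether their optional dependency can be imported, so the rest of the flow still runs. The helper names load_optional and usable_features and the feature name "length" are hypothetical; only "topic" and its pymmseg dependency (imported as mmseg in the tracebacks) come from this thread.

```python
import importlib


def load_optional(module_name):
    """Import a module if it is installed; return None otherwise."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# From the thread: only the "topic" feature depends on pymmseg.
FEATURE_DEPENDENCIES = {"topic": "mmseg"}


def usable_features(requested, dependencies=FEATURE_DEPENDENCIES,
                    loader=load_optional):
    """Keep the requested features whose optional dependency is importable."""
    usable = []
    for name in requested:
        dependency = dependencies.get(name)
        if dependency is None or loader(dependency) is not None:
            usable.append(name)
    return usable


# "length" is a made-up feature name with no extra dependency.
print(usable_features(["topic", "length"]))
```

Deferring the import this way keeps a missing dictionary or an uninstalled module from failing at startup, at the cost of silently skipping the affected feature.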
When I switched to dev, updated autoweight.json, and set queue.json to 'yes', I got this error.
Traceback (most recent call last):
  File "srfe.py", line 47, in <module>
    q = SRFEQueue(sp)
  File "/home/lijunbo/Github/sns-router/queue.py", line 51, in __init__
    from ranking import score
  File "/home/lijunbo/Github/sns-router/ranking/score.py", line 21, in <module>
    from feature import Feature
  File "/home/lijunbo/Github/sns-router/ranking/feature.py", line 23, in <module>
    from wordseg import wordseg_clean
  File "/home/lijunbo/Github/sns-router/ranking/wordseg.py", line 17, in <module>
    mmseg.Dictionary.load_dictionaries()
  File "/usr/lib64/python2.6/site-packages/pymmseg_cpp-1.0.0-py2.6-linux-x86_64.egg/mmseg/__init__.py", line 20, in load_dictionaries
    raise IOError("Cannot open '%s'" % d)
IOError: Cannot open 'kdb/words.merged.dic'
So I commented out from wordseg import wordseg_clean, and got this:
Traceback (most recent call last):
  File "srfe.py", line 47, in <module>
    q = SRFEQueue(sp)
  File "/home/lijunbo/Github/sns-router/queue.py", line 53, in __init__
    self.score = score.Score()
  File "/home/lijunbo/Github/sns-router/ranking/score.py", line 29, in __init__
    self.load_weight(fn_weight)
  File "/home/lijunbo/Github/sns-router/ranking/score.py", line 34, in load_weight
    self.feature_weight = json.loads(open(fn, 'r').read())
IOError: [Errno 2] No such file or directory: 'conf/weights.json'
So I created conf/weights.json and wrote {} to it; at last I can run srfe.py with queue.py :D
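The workaround above (an empty {} in conf/weights.json) can also be expressed in code. Below is a minimal sketch of a loader that falls back to an empty weight table when the file does not exist yet; load_weight_or_default is a hypothetical name, not the load_weight method in ranking/score.py.

```python
import json
import os


def load_weight_or_default(fn, default=None):
    """Load feature weights from a JSON file.

    Hypothetical helper: returns a copy of `default` (an empty dict when
    not given) if the file is missing, instead of raising IOError.
    """
    if default is None:
        default = {}
    if not os.path.exists(fn):
        return dict(default)
    with open(fn, "r") as f:
        return json.load(f)


print(load_weight_or_default("conf/weights.json"))
```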
Fixed: {}. Do you experience errors when using the "Operation" panel (which then creates a useful 'weights.json')?
Actually, when I open 'http://127.0.0.1:8080/config', the Feature Weight table and the Tags table are empty. I am sure that queue.py is yes, and autoweight.json is the default one, which contains 12 preferences and 4 features.
I found no errors or exceptions about this, but I will try to figure it out, since this may be due to environment issues.
When I click Prepare Training Data in http://127.0.0.1:8080/operation, an error is thrown.
Traceback (most recent call last):
  File "bottle/bottle.py", line 763, in _handle
    return route.call(**args)
  File "bottle/bottle.py", line 1572, in wrapper
    rv = callback(*a, **ka)
  File "bottle/bottle.py", line 3132, in wrapper
    result = func(*args, **kwargs)
  File "srfe.py", line 83, in wrapper_check_login
    return func(*al, **ad)
  File "srfe.py", line 153, in operation_prepare_training_data
    re = q.prepare_training_data()
  File "/home/lijunbo/Github/sns-router/queue.py", line 631, in prepare_training_data
    from analysis.select_samples import select_samples
  File "/home/lijunbo/Github/sns-router/analysis/select_samples.py", line 19, in <module>
    from feature import Feature
  File "/home/lijunbo/Github/sns-router/analysis/feature.py", line 22, in <module>
    from wordseg import wordseg_clean
  File "/home/lijunbo/Github/sns-router/analysis/wordseg.py", line 17, in <module>
    mmseg.Dictionary.load_dictionaries()
  File "/usr/lib64/python2.6/site-packages/pymmseg_cpp-1.0.0-py2.6-linux-x86_64.egg/mmseg/__init__.py", line 20, in load_dictionaries
    raise IOError("Cannot open '%s'" % d)
IOError: Cannot open 'kdb/words.merged.dic'
wow, sns-router is so desperate for words.dic :D
@uxian , I see. Old logic in "analysis" is not decoupled yet. I should handle the "Operations" related ones first.
For the "tags" table under "config", can you add your own tags using the button under the same headline? By default, there are no tags, and users define tags according to their own criteria.
Also, I just realized that I can make one words.merged.dic here for you to download... Anyway, users will get the same dict if they operate in the same way... One can prepare his own wordseg dict...
Another issue I can forecast is the encoding issue for wordseg related operations. The pymmseg module assumes utf-8 encoding. On other platforms, transcoding is needed before feeding the message into pymmseg.
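The transcoding step mentioned above might look like the following sketch: text arriving in another encoding is decoded and re-encoded as UTF-8 before it is handed to the word segmenter. The helper name to_utf8 is hypothetical, GBK is used only as an example source encoding, and the sketch stops short of calling pymmseg.

```python
def to_utf8(raw, source_encoding):
    """Return UTF-8 bytes for text that may arrive in another encoding.

    Hypothetical helper: `raw` may be bytes in `source_encoding` or an
    already-decoded str.
    """
    if isinstance(raw, bytes):
        text = raw.decode(source_encoding)
    else:
        text = raw
    return text.encode("utf-8")


# Example: a message received as GBK bytes on a platform that uses it.
gbk_message = "中文分词".encode("gbk")
utf8_message = to_utf8(gbk_message, "gbk")
print(utf8_message.decode("utf-8"))
```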
I can add a "tag" on the config page, but I cannot set relations between tags (assign a "father"). And where are user-added tags stored? It seems they are not in autoweight.conf.
Sorry for the confusion of "parent". It's not implemented in the backend yet. I added the desired function description in a new issue.
The tags are stored in srfe_queue.db. Three tables, msg, tag, and msg_tag, are laid out in the usual way. The "msg_tag" table only makes sense together with the "tag" table. Ideally, some of the JSON confs should also be moved into this sqlite db, so that users only need to take this single file wherever they move their Router.
sqlite> select * from tag;
1|null|0|
2|mark|1|
3|gold|1|
4|silver|1|
5|bronze|1|
6|news|1|
7|interesting|1|
8|shit|0|
9|nonsense|1|
10|text|0|
11|tech|1|
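To illustrate the "usual" layout described above, here is a small sketch of a many-to-many relation between messages and tags using an in-memory sqlite database. The column names (including visible) are guesses for illustration only and are not the actual srfe_queue.db schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE msg (id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT, visible INTEGER);
-- msg_tag links messages to tags; it is meaningless without the tag table.
CREATE TABLE msg_tag (
    msg_id INTEGER REFERENCES msg(id),
    tag_id INTEGER REFERENCES tag(id),
    PRIMARY KEY (msg_id, tag_id)
);
""")
conn.execute("INSERT INTO msg VALUES (1, 'hello')")
conn.executemany("INSERT INTO tag VALUES (?, ?, ?)",
                 [(2, "mark", 1), (11, "tech", 1)])
conn.executemany("INSERT INTO msg_tag VALUES (?, ?)", [(1, 2), (1, 11)])


def tags_of(conn, msg_id):
    """Return the names of all tags attached to one message."""
    rows = conn.execute(
        "SELECT tag.name FROM tag "
        "JOIN msg_tag ON tag.id = msg_tag.tag_id "
        "WHERE msg_tag.msg_id = ? ORDER BY tag.id",
        (msg_id,),
    )
    return [name for (name,) in rows]


print(tags_of(conn, 1))
```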
I see. Here is another dependency, './kdb/tdict.pickle'. When I click 'Prepare Training Data', this error appears...
Traceback (most recent call last):
  File "bottle/bottle.py", line 763, in _handle
    return route.call(**args)
  File "bottle/bottle.py", line 1572, in wrapper
    rv = callback(*a, **ka)
  File "bottle/bottle.py", line 3132, in wrapper
    result = func(*args, **kwargs)
  File "srfe.py", line 83, in wrapper_check_login
    return func(*al, **ad)
  File "srfe.py", line 153, in operation_prepare_training_data
    re = q.prepare_training_data()
  File "/home/lijunbo/Github/sns-router/queue.py", line 631, in prepare_training_data
    from analysis.select_samples import select_samples
  File "/home/lijunbo/Github/sns-router/analysis/select_samples.py", line 19, in <module>
    from feature import Feature
  File "/home/lijunbo/Github/sns-router/analysis/feature.py", line 247, in <module>
    class Feature(object):
  File "/home/lijunbo/Github/sns-router/analysis/feature.py", line 259, in Feature
    feature_extractors.append(FeatureTopic(env))
  File "/home/lijunbo/Github/sns-router/analysis/feature.py", line 162, in __init__
    self.tdict = Serialize.loads(open(fn_tdict).read())
IOError: [Errno 2] No such file or directory: './kdb/tdict.pickle'
pymmseg is not working on my Windows machine.
The story starts here:
1. 'ranking' defaults to 'no' in queue.py; then I cannot open the config page, and I got this
2. So I checked queue.py and found that if 'ranking' is not yes, q.score will be None
3. Then I set 'ranking' to 'yes' in queue.py, but when I ran srfe.py, an error happened
4. Then I tried to put the following line into srfe.py
5. I got this:
6. So it turns out that pymmseg is not working on my computer. I tried to install pymmseg, but its setup.py did not work.
T T
Can you find a way to solve this? @hupili