Closed canavandl closed 8 years ago
Cross filter is accessible at:
http://localhost:8084/cross_filter?session=%session args
I usually open the Page Statistics link, then hand-change the url to cross_filter
Now available through the "Page Statistics" button and at /statistics?session=... (whatever the session args are)
ping @yamsgithub for review
I've added server-side caching of the elasticsearch queries, so the interactivity is close to a reasonable speed.
note: I added a functools32 (backport of functools for python2) dependency in order to use lru_cache. You'll have to update your env accordingly (it's in the environment.yml file).
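For anyone unfamiliar with lru_cache, here is a minimal sketch of the caching idea. It is not the code in this PR: `run_query` and its arguments are hypothetical stand-ins for whatever function actually hits elasticsearch, and the import falls back to functools32 on python2.

```python
try:
    from functools import lru_cache      # python3
except ImportError:
    from functools32 import lru_cache    # python2 backport

calls = []  # records each real (uncached) query, for illustration only


@lru_cache(maxsize=128)
def run_query(index_name, field):
    # Hypothetical stand-in for the real elasticsearch call.
    # Arguments must be hashable (strings, tuples), not lists or dicts.
    calls.append((index_name, field))
    return "results for %s/%s" % (index_name, field)


run_query("pages", "tag")
run_query("pages", "tag")   # same arguments: served from the cache
```

One caveat with this approach: cached results go stale when the index changes, so the cache would need to be cleared (`run_query.cache_clear()`) after new pages are downloaded.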
todo:
(NoneType has no property something exception. I probably introduced this bug.)
current status:
ping @yamsgithub for review
current status:
@yamsgithub please take a look at this again.
I see that you are using a call to read the whole index. This will not scale. You should be using elasticsearch aggregations.
def get_plotting_data(index_name, es=None):
    if es is None:
        es = default_es
    res = es.search(index_name, size=100000, fields=["retrieved", "url", "tag", "query"])
    fields = []
    for item in res['hits']['hits']:
        if item['fields'].get('tag') != None:
            if item['fields']['tag'][0] == '':
                item['fields'].pop('tag')
        fields.append(item['fields'])
    return fields
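As a rough sketch of what I mean by aggregations (an assumption about what the plots need, not a drop-in replacement): a terms aggregation lets elasticsearch count documents per tag on the server, so only the buckets come back. The helper names and the `tag_counts` aggregation name are hypothetical; the `tag` field is taken from the query above.

```python
def build_tag_counts_query(max_buckets=50):
    # "size": 0 asks for no hits at all, only the aggregation results.
    return {
        "size": 0,
        "aggs": {
            "tag_counts": {"terms": {"field": "tag", "size": max_buckets}}
        },
    }


def parse_tag_counts(response):
    # Each bucket holds one distinct tag value and its document count.
    buckets = response["aggregations"]["tag_counts"]["buckets"]
    return dict((b["key"], b["doc_count"]) for b in buckets)


# With a client this would be called roughly as (untested here):
#   res = es.search(index=index_name, body=build_tag_counts_query())
#   counts = parse_tag_counts(res)
```

The response size then depends on the number of distinct tags rather than the number of pages in the index.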
I can now see the graphs. But there is still no zoom.
@yamsgithub - the plots don't currently have any zoom tool activated because I didn't think it added much to the visualization. I can add them though - you want the box zoom or wheel zoom? Any other interactions?
It would when the number of queries and tags are large.
The wheel zoom would be more appropriate here.
Added wheel_zoom and reset button:
The buttons are kind of ugly, but there's not a lot that can be done. Alternatively, you could remove reset, then have the wheel zoom button hidden but always on. Also, do you want the pan tool (click and drag on the plot to move the view window)?
Yeah...pan tool would be useful. This would also be useful on the page clustering window.
The plot of pages downloaded over time does not have the actual date on the plot.
So the other things we mentioned:
Making the text bounding box transparent. Add 'Help' button that pops a text box where we can add all the instructions and features.
@yamsgithub
Updates:
I wasn't able to reproduce your reset button issues in my Ubuntu VM on Chrome. If you pull the recent commits onto your branch and still have the issue, let me know. Also please check your console to see if any helpful error messages are being logged.
The last change I've got to make is to add a callback onto the datetime picker widgets so that it fires on change like the tables. It's taking me a minute to figure it out, but I'll figure it out.
@canavandl
I just tested the changes. Most are fine. Here are a few comments:
I am also seeing this strange issue. So I make a new web query and then click on the page stats tab. I get the following error:
But if I restart the ddt server then I no longer see this error and all is working fine!
OK...so I now see the date/time but the time is not local time.
I moved the help hint into the nav bar on the far right and fixed the timeseries/local timestamp issue.
todo:
ping @yamsgithub
I believe I have resolved all of the issues/comments. Pls review when you have time.
@canavandl
The following errors still exist:
@yamsgithub
I sent an email about this:
I think the issue is that we're checking if different query results have domains in common, not specific pages. So query_A which returns nytimes.com/news_article_A would be linked to query_B's result of nytimes.com/news_article_B.
Is it your desire for the links to be for specific pages and not only domains? (This is likely due to me not understanding the web scraping domain very well.) If so, it's a quick two-line fix.
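To make the difference concrete, here is a small sketch of the two linking rules. The function names are hypothetical and this is not the actual plotting code, just the comparison it boils down to:

```python
try:
    from urllib.parse import urlparse   # python3
except ImportError:
    from urlparse import urlparse       # python2


def domains(urls):
    # Reduce each URL to its host, e.g. "nytimes.com".
    return set(urlparse(u).netloc for u in urls)


def linked_by_domain(urls_a, urls_b):
    # Current behavior as I understand it: any shared domain links the queries.
    return bool(domains(urls_a) & domains(urls_b))


def linked_by_page(urls_a, urls_b):
    # Stricter rule: the queries must share at least one exact page URL.
    return bool(set(urls_a) & set(urls_b))


query_a = ["http://nytimes.com/news_article_A"]
query_b = ["http://nytimes.com/news_article_B"]
```

With these inputs, `linked_by_domain(query_a, query_b)` is true while `linked_by_page(query_a, query_b)` is false, which matches the example above.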
@canavandl
With the latest changes I see no links between the nodes that definitely have pages in common. There are no links between any nodes!