Closed alvaromorales closed 8 years ago
Sure, why not, that could be added. But, for the record, articles_of_class used to be a fast function (see issue #11).
It takes ~4 minutes to fetch 39768 articles with class officeholder.
I don't think this is a performance issue anymore – it's just that Wikipedia is huge. If I remember correctly, WikithingsDB used to be fast when it only had 678 articles.
Looks like I might need to rebuild the database after adding lazy='dynamic' to the relationships if I want to limit the number of WikiClass.page's returned. I'm trying to do something like this:
result = session.query(WikiClass)\
    .filter_by(class_name=w_class)\
    .one()\
    .page\
    .limit(limit)
See http://stackoverflow.com/a/19233187 and http://stackoverflow.com/a/11579347. Otherwise, I get an error like: 'InstrumentedList' object has no attribute 'limit'.
@alvaromorales do you see any obvious ways around this?
Using the limit function is a clean way to do this. You shouldn't need to rebuild the database; you just need to run a migration (schema change). Alembic seems to be the tool of choice.
You're using a list comprehension to return articles in articles_of_class. I thought we could just use enumerate in a for loop to limit the number of articles to return. But SQLAlchemy is actually executing the query against the database, getting all presidents, and then truncating the list to 10. It's still slow -- we need limit.
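To make the difference concrete, here is a minimal sketch of pushing the limit into the query itself rather than truncating in Python. It uses the standard-library sqlite3 module instead of SQLAlchemy so it runs on its own. The table and column names (wiki_class, page, class_id) and the limit kwarg are assumptions for illustration, not the actual WikithingsDB schema or API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wiki_class (id INTEGER PRIMARY KEY, class_name TEXT);
        CREATE TABLE page (
            id INTEGER PRIMARY KEY,
            title TEXT,
            class_id INTEGER REFERENCES wiki_class(id)
        );
        """
    )
    conn.execute(
        "INSERT INTO wiki_class (id, class_name) VALUES (1, 'officeholder')"
    )
    # 1000 made-up articles, all belonging to the same class.
    conn.executemany(
        "INSERT INTO page (title, class_id) VALUES (?, 1)",
        [("Article %d" % i,) for i in range(1000)],
    )
    conn.commit()
    return conn


def articles_of_class(conn, w_class, limit=None):
    """Return article titles for w_class.

    When limit is given, it is sent to the database as a LIMIT clause,
    so only that many rows are ever fetched.
    """
    sql = (
        "SELECT page.title FROM page "
        "JOIN wiki_class ON page.class_id = wiki_class.id "
        "WHERE wiki_class.class_name = ? "
        "ORDER BY page.id"
    )
    params = [w_class]
    if limit is not None:
        sql += " LIMIT ?"
        params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]


conn = build_demo_db()
first_ten = articles_of_class(conn, "officeholder", limit=10)
everything = articles_of_class(conn, "officeholder")
```

With limit=10 the database returns ten rows; without it, every matching row is loaded before Python sees any of them, which is the slow path described above. In SQLAlchemy the equivalent would presumably be calling .limit() on a query object before it is executed.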
We can use WikithingsDB to get a list of articles with a certain class. For example:
This query returns all articles, and may take a long time. It would be nice to add a kwarg to limit the number of articles returned. For example:
cc @TheRealAkhil