alexmy21 / jwpl

Automatically exported from code.google.com/p/jwpl

page.getCategories() method returns hidden categories #83

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Bug was originally reported here: 
http://groups.google.com/group/jwpl/t/97a005ede47ee60f

---

I noticed that the page.getCategories() method returns a lot of
categories that are not visible on the page.

For example the categories of the article "Germany" include:

Wikipedia indefinitely move-protected pages
All articles containing potentially dated statements
Articles containing German language text
Articles containing potentially dated statements from 2008
Wikipedia semi-protected pages
Articles containing potentially dated statements from November 2009
Articles with dead external links from September 2010
All articles with dead external links
Featured articles
Articles with dead external links from June 2010

All these categories seem rather useless to me. I think it would be
nice if there were a method in the Page class that returned only
visible categories instead of all categories.

I also noticed that all these categories share the parent category
"Hidden categories".
That makes it easy to filter them from the result set. I added this
method to the Page class in my local version of the JWPL API:

// Returns only the categories that are not children of "Hidden categories".
public Set<Category> getVisibleCategories()
{
        // Load the IDs of all categories assigned to this page.
        Session session = this.wiki.__getHibernateSession();
        session.beginTransaction();
        session.lock(hibernatePage, LockMode.NONE);
        Set<Integer> tmp = new UnmodifiableArraySet<Integer>(hibernatePage.getCategories());
        session.getTransaction().commit();

        // Resolve each ID to its Category object.
        Set<Category> allCategories = new HashSet<Category>();
        for (int pageID : tmp) {
                allCategories.add(wiki.getCategory(pageID));
        }

        // Keep only categories whose parents do not include "Hidden categories".
        Set<Category> result = new HashSet<Category>();
        for (Category category : allCategories)
        {
                Set<Integer> parentIds = category.getParentIDs();
                // 15961454 is the hard-coded page ID of "Hidden categories".
                if (!parentIds.contains(15961454))
                {
                        result.add(category);
                }
        }
        return result;
}

This solution is bad because I hard-coded the page ID of "Hidden
categories" in the method, but maybe you could include a similar,
better method in the next release of JWPL?
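
One way to avoid the hard-coded page ID might be to compare parent
category titles instead of IDs. The following is a minimal,
self-contained sketch of that idea, not JWPL code: the class
VisibleCategoryFilter, the method visibleCategories and the parentsOf
map are hypothetical stand-ins for the JWPL category lookup, and the
"Countries in Europe" entry is made-up sample data. It also assumes the
parent category is titled "Hidden categories" in the dump being used.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class VisibleCategoryFilter {

    // Assumption: hidden categories are direct children of a category with this title.
    static final String HIDDEN_PARENT = "Hidden categories";

    /**
     * Returns the categories whose direct parents do not include HIDDEN_PARENT.
     *
     * @param categories titles of all categories assigned to a page
     * @param parentsOf  hypothetical lookup: category title -> titles of its direct parents
     */
    static Set<String> visibleCategories(Set<String> categories,
                                         Map<String, Set<String>> parentsOf) {
        Set<String> visible = new LinkedHashSet<>();
        for (String category : categories) {
            // A category missing from the lookup is treated as having no parents.
            Set<String> parents = parentsOf.getOrDefault(category, Set.of());
            if (!parents.contains(HIDDEN_PARENT)) {
                visible.add(category);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Two hidden categories from the report plus one made-up visible category.
        Set<String> categories = new LinkedHashSet<>(Arrays.asList(
                "Featured articles",
                "Wikipedia semi-protected pages",
                "Countries in Europe"));

        Map<String, Set<String>> parentsOf = Map.of(
                "Featured articles", Set.of(HIDDEN_PARENT),
                "Wikipedia semi-protected pages", Set.of(HIDDEN_PARENT),
                "Countries in Europe", Set.of("Europe"));

        System.out.println(visibleCategories(categories, parentsOf));
    }
}
```

Matching on the title keeps the filter independent of any one dump's
page IDs, but it would still need adjusting for other language editions,
where the parent category has a different name.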

Original issue reported on code.google.com by oliver.ferschke on 1 Mar 2012 at 1:26

GoogleCodeExporter commented 9 years ago
Valid argument. We should do something about that.

Original comment by oliver.ferschke on 1 Mar 2012 at 1:26

GoogleCodeExporter commented 9 years ago
Issue 122 has been merged into this issue.

Original comment by torsten....@gmail.com on 9 Dec 2013 at 9:47