Closed gyachdav closed 8 years ago
@gyachdav I was on my way back to München; I'll take a look now.
@gyachdav @sacdallago I saw the wrong entries. Should we provide "removeById" functionality in the API?
Can you investigate how they ended up as characters in the first place? It would be great if we could correct the scraper and refill the db with a corrected list.
Yes, I'm already working on it.
Good. I don't think that removing items would do any good, as the policy of refilling would later put them in again. As @gyachdav suggested, it's better to solve the problem at the root, that is: the wiki. I know it's a big effort, but let's be consistent and let's hope the wiki gets better and better.
What needs to be inspected is the getAllNames function of the characters scraper. Its result is later used to determine which wiki pages to scrape, and it seems to pick up too much. I am not sure if I have time to do this, because of a death in my girlfriend's family.
@theocheslerean
@Adiolis I'll do it right now.
@boriside The problem is that the scraper is buggy as f**ck. Some characters are not even scraped, and many properties are not scraped at all.
@Adiolis I'm sorry to hear about the loss :disappointed: I hope you and your girlfriend are holding up.
Let some other people take some of your workload. @boriside, thanks for doing that already. I'm also talking about the "page rank" issue.
@Adiolis It's not because of the scraper, but because of the inconsistent information in the wiki. So I think the best approach would be to exclude the exceptions. I'm also sorry about the loss, take your time. :(
Thank you both. It happened two days ago =/. I am taking my time, but this is just a quick fix of 5 minutes... Feel free to edit my solution, whatever.
The scraper was taking all links instead of only the first.
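To illustrate the kind of fix described here, a minimal sketch in Python (not the project's actual code). It assumes the wiki list page is HTML where each `<li>` holds several links and only the first one names the character. The names `FirstLinkParser` and `first_link_per_item`, and the sample data, are hypothetical.

```python
from html.parser import HTMLParser


class FirstLinkParser(HTMLParser):
    """Collects the title of the first <a> inside each <li>, ignoring later links."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_item = False  # currently inside an <li>
        self._taken = False    # a link was already taken for this <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._taken = False
        elif tag == "a" and self._in_item and not self._taken:
            title = dict(attrs).get("title")
            if title:
                self.names.append(title)
                self._taken = True  # skip every further link in this item

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False


def first_link_per_item(html):
    """Return one name per list item (hypothetical helper)."""
    parser = FirstLinkParser()
    parser.feed(html)
    return parser.names


# Made-up sample markup: the second link in each item points to a house page.
sample = (
    "<ul>"
    '<li><a title="Character A">Character A</a> of <a title="House X">House X</a></li>'
    '<li><a title="Character B">Character B</a>, <a title="House Y">House Y</a></li>'
    "</ul>"
)
print(first_link_per_item(sample))
```

Taking every `<a>` in the item would also return "House X" and "House Y", which matches the symptom of house names showing up in the character list.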
@boriside excluding data is a bold move :) I wouldn't do that, personally. Incorrect data is better than possibly no data. Wikis are a joint effort, thus mistakes are acceptable and encourage users to curate data.
@Adiolis should I run an update on characters?
I fixed the bug where the scraping could not terminate because of the skipping of houses. Now everything should be fine.
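The thread does not say how the skip prevented termination. One common way this happens is a manual loop whose counter is only advanced after a successful scrape, so a skipped entry stalls it. A hedged sketch in Python, assuming houses can be recognised by a "House " prefix; `scrape_characters` and the `scrape_page` callback are hypothetical names:

```python
def scrape_characters(names, scrape_page):
    """Scrape every name except house pages; always terminates for a finite list."""
    results = []
    i = 0
    while i < len(names):
        name = names[i]
        i += 1  # advance before any early exit, so a skip cannot stall the loop
        if name.startswith("House "):  # assumed way to recognise house pages
            continue
        results.append(scrape_page(name))
    return results


# Usage with a stand-in for the real page scraper:
print(scrape_characters(["Character A", "House X", "Character B"], str.upper))
```

A plain `for name in names:` loop would avoid the problem entirely; the explicit counter is kept here only to show where a skip can go wrong.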
Thx
The character list still contains house names. Please clean it up.