Open MightyRufo opened 9 years ago
Oh... heh... I'd love to help fix it, but I currently only have a Chromebook for development, which only has a measly Celeron and 120GB of HD space. I'm not sure what could be causing this. It looks like some encoding issue; maybe Go is freaking out for some reason and spitting out random bytes. I don't really expect to have a resolution for this anytime soon.
Alright, thanks for your input. Overnight I decided to set up my own server and use that to store torrents. The only thing I miss is the UI. Your design actually looks nice.
I'm glad you like the design! If you have any ideas on the issue it might help me fix it. It's just a bit tedious without a QA or testing team. I also have lots of other pending tasks on bitcannon as well. Someday I'd like to use it to keep a torrent archive on a server in my closet or something, but I don't have the budget or the space and building a NAS for backups is my top priority. You can tell I'm a bit of a data hoarder, hence this project.
Me and you both, which is why I'm doing this in the first place: just in case any major torrent site goes down, I'll still have access to those torrents. I've been collecting for a while now, and I am just getting started ;)
Can you check the database and make sure the torrent titles aren't actually corrupted? Try a program like robomongo or something and see if the data comes out okay. Some of my code is questionable because I wasn't sure if demand for an integrated database (sqlite) would force me to rewrite it all.
Already have that set up, but I haven't actually looked at that. I'm at work at the moment, but I get off in 2 hours. I'll look then and post back. Also, when I try to browse a specific category of torrents, it gives me an API error, but bitcannon says '404 not found'. Can this be caused by an incompatible dump that I imported?
I'm not really sure. I remember the import system is a bit dumb and just makes assumptions about the format. I think it just checks for the correct number of fields and validates the btih.
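A rough sketch of the kind of check described here, for illustration only. It assumes a pipe-delimited dump line with the info hash in the first field; the delimiter, the expected field count, and the function names are all assumptions, not bitcannon's actual import code (which is written in Go).

```javascript
// Hypothetical sketch of a "dumb" import check: count the fields, validate the btih.
// The '|' delimiter and the expected field count are assumptions for illustration.
const BTIH_HEX = /^[0-9a-fA-F]{40}$/; // a hex-encoded SHA-1 info hash is 40 characters

function checkLine(line, expectedFields) {
  const fields = line.split('|');
  if (fields.length !== expectedFields) return false; // wrong number of fields
  return BTIH_HEX.test(fields[0]); // first field must look like an info hash
}

// Tally accepted vs. rejected lines, similar to the counters an import log prints.
function tallyImport(lines, expectedFields) {
  let imported = 0;
  let skipped = 0;
  for (const line of lines) {
    if (checkLine(line, expectedFields)) imported++;
    else skipped++;
  }
  return { imported, skipped };
}
```

Note that a check like this accepts any line with the right number of fields and a plausible hash, even if the category field contains garbage, which would fit the symptoms in this thread.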
And auto-correct is destroying me right now, forgive me.
I see, any way I can send you some data to look at? I know that helps. Just tell me what you need. Looking at it yourself may help.
Basically I need to know whether the data is corrupted or if bitcannon is making it all funky after reading perfectly normal data from the database. So if you can get an info hash that has a garbled title and look at the entry in robomongo then we can see if bitcannon is doing something really weird or not.
Alright, will do.
Weird, I started MongoDB and went into Robomongo, but now MongoDB is running an index build. I have to wait. I'm not familiar with Robomongo, how do I get it to show information about what I select on the file tree?
Alright, what am I looking for here? I see bitcannon and under that I see System and Torrents. Double clicking on torrents gives me a tab, which then looks like info hashes. Viewing the list (it only shows 50 different hashes?), I can expand each one of them and I can see the title. Titles seem to be intact.
Ah, I just realized the part that you showed that was broken is the categories. Are the categories showing up fine as well in robomongo? What bitcannon does is try to get a unique list of categories, then lists those categories on browse. If you imported something with weird data in categories, then a bunch of new categories would get created, possibly flooding the categories page with junk. I think that might be what happened.
How do I view that in robomongo? I see the category section but it doesn't let me view it.
It should just be a field that you can view. I can't really remember exactly how it's stored but it should show something
On Sat, Oct 10, 2015, 12:10 AM MightyRufo notifications@github.com wrote:
How do I view that in robomongo? I see the category section but it doesn't let me view it.
— Reply to this email directly or view it on GitHub https://github.com/Stephen304/bitcannon/issues/80#issuecomment-147034475 .
This is what I'm seeing: http://puu.sh/kEMUB/0e8fb6f8a2.png
Can you expand an entry with the arrow on the left?
Yeah, here: http://puu.sh/kEN4E/73123e549c.png
Do you get a pictures category on your browse page?
Seems so, yes: http://puu.sh/kENgf/5f532ca965.png
To me it looks like all the garbled categories have only 1 torrent in them. In that case it would mean you imported a bad file.
Yup, I imported the openbay database LOL. I think I'll roll the database back and then avoid that dump. What are the 404 errors caused by? I mean obviously they mean 'not found'. But why might that be happening?
I'm not sure yet. I need sleep first
Alright, well. Thanks for the help so far.
Alright, so. I rolled back my database. I noticed there are a couple of torrents that have invalid categories. How do I find them and correct them?
If you click on them and get the btih and do a query with robomongo (Look up some guides) then you should be able to change the category in robomongo.
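As a minimal sketch of what such a query could look like: the helper below builds the filter and update documents for changing one torrent's category. It assumes the info hash is stored as the document's `_id` and the category in a `category` field; the helper name and the replacement category are hypothetical.

```javascript
// Build the filter and update documents for fixing a single torrent's category.
// Assumption: the info hash (btih) is stored as the document's _id.
function buildCategoryFix(btih, newCategory) {
  return {
    filter: { _id: btih },
    update: { $set: { category: newCategory } },
  };
}

// In the mongo shell (or a Robomongo query tab) this could be used roughly as:
//   var fix = buildCategoryFix('<btih>', 'Other');
//   db.getCollection('torrents').find(fix.filter);               // inspect the entry first
//   db.getCollection('torrents').update(fix.filter, fix.update); // then change the category
```

Running the `find` first is a cheap way to confirm the filter matches exactly the entry you expect before modifying anything.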
Hmm, not so sure I can. Clicking on any of the bad categories just brings me to a page showing zero torrents, such as this: http://puu.sh/kG0C4/9bf4dbf8d4.png
That's tough... The only 2 options I see are either we write code in bitcannon that checks the database on startup for weird categories and handles it somehow, or you can try to manually find the offending entries. You might be able to do a query for categories that are longer than 20 characters, but I'm a bit rusty on mongo queries.
I wouldn't code bitcannon to do that on start-up every time. I'd code it so I can execute a command.
Well, if I remember correctly, I set it up to query for all unique values of category and then query each category for a total count, so the information is almost there. I would need to implement the category whitelist so it can print out a warning or do something when there are weird categories.
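A minimal sketch of the whitelist idea, working on plain objects rather than a live database. The category names in the whitelist are placeholders, not bitcannon's real list, and the function names are made up for illustration.

```javascript
// Hypothetical whitelist; bitcannon's real category list may differ.
const KNOWN_CATEGORIES = new Set(['Movies', 'TV', 'Music', 'Games', 'Books', 'Pictures', 'Other']);

// Count how many torrents fall into each distinct category value.
function countByCategory(torrents) {
  const counts = new Map();
  for (const t of torrents) {
    counts.set(t.category, (counts.get(t.category) || 0) + 1);
  }
  return counts;
}

// Return every category that is not on the whitelist, with its torrent count,
// so the caller can print a warning or clean the entries up.
function suspiciousCategories(torrents, whitelist) {
  const warnings = [];
  for (const [category, count] of countByCategory(torrents)) {
    if (!whitelist.has(category)) warnings.push({ category, count });
  }
  return warnings;
}
```

Garbled categories from a bad import would tend to show up here with a count of 1, which matches what was seen on the browse page earlier in the thread.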
I'd also be interested in seeing what the data for the offending torrents looks like - I wonder if adding stricter btih hash validation would prevent it from happening.
That would be nice. Last night I imported 7 million torrents. Out of those 7 million, about 10 are messing up the categories page. Otherwise, it works very well!
The problem is that the import system is so dumb. I would love to have something where it prompts you to verify the first torrent's info before running the import, then saves the profile for each auto import source as some kind of format string.
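One way the format-string idea could look, as a sketch only. The format syntax (`"btih|title|category|-|size"`, with `-` marking an ignored field) and the function name are invented here for illustration; nothing like this exists in bitcannon as far as this thread shows.

```javascript
// Parse one dump line using a per-source format string such as
// "btih|title|category|-|size", where "-" marks a field to ignore.
// Returns null when the line does not have the number of fields the format expects.
function parseWithFormat(format, line, delimiter = '|') {
  const names = format.split(delimiter);
  const values = line.split(delimiter);
  if (names.length !== values.length) return null;
  const record = {};
  names.forEach((name, i) => {
    if (name !== '-') record[name] = values[i];
  });
  return record;
}
```

The first parsed record could then be shown to the user for confirmation before the rest of the file is imported, and the confirmed format string saved alongside the import source.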
That does sound pretty good. But it also takes time. And coding everything on your own takes even more time.
Just got this from kat's hourly import. I'm guessing I'll have to turn off hourly updates for now.
Are kat dumps fully importing? For me, it seems to skip most of them. And yes, this is what I have too.
@MightyRufo
2015/10/12 15:57:53 [OK!] Starting to import from url:
2015/10/12 15:57:53 https://kat.cr/api/get_dump/hourly/?userhash=USER_HASH_HERE
2015/10/12 15:57:55 [!!!] I was given a URL that doesn't end in .txt or .txt.gz.
2015/10/12 15:57:55 I'll assume it's regular text.
2015/10/12 15:57:55 [OK!] Compression detection complete
2015/10/12 15:57:55 [OK!] Reading initialized
2015/10/12 15:57:58 [OK!] Reading completed
2015/10/12 15:57:58 0 torrents imported
2015/10/12 15:57:58 3866 torrents skipped
2015/10/12 15:57:58 [OK!] Starting to import from url:
2015/10/12 15:57:58 http://www.demonoid.pw/api/demonoid24h.txt.gz
2015/10/12 15:57:59 [OK!] Compression detection complete
2015/10/12 15:57:59 [OK!] GZip detected, unzipping enabled
2015/10/12 15:57:59 [OK!] Reading initialized
2015/10/12 15:57:59 [OK!] Reading completed
2015/10/12 15:57:59 0 torrents imported
2015/10/12 15:57:59 199 torrents skipped
2015/10/12 15:57:59 [OK!] Starting to import from url:
2015/10/12 15:57:59 http://bitsnoop.com/api/latest_tz.php?t=all
2015/10/12 15:58:00 [!!!] I was given a URL that doesn't end in .txt or .txt.gz.
2015/10/12 15:58:00 I'll assume it's regular text.
2015/10/12 15:58:00 [OK!] Compression detection complete
2015/10/12 15:58:00 [OK!] Reading initialized
2015/10/12 15:58:00 [OK!] Reading completed
2015/10/12 15:58:00 9 torrents imported
2015/10/12 15:58:00 16 torrents skipped
2015/10/12 15:58:00 [OK!] Finished auto importing.
How did you find the entry and remove it?
It was a new install of bitcannon so I just opened Robomongo and looked through the results.
Ahh, I have over 7 million torrents imported, most of them work just fine. But I have a few pesky ones that I obviously cannot locate out of 7 million.
Maybe try searching for torrents with a category field longer than 10 - 20 chars? http://docs.mongodb.org/manual/reference/operator/query/size/
Funny enough, executing the command in Robomongo doesn't bring me any results when specifying any number for size. Either it's not working or I'm not doing it right. http://puu.sh/kHfnw/20f0ae47ed.png
Try this, may take a while. I've been running it on mine with 1.6M records and it's still going after 5 minutes.
db.getCollection('torrents').find({$where:"this.category.length > 20"})
You'll either get the broken records or something like this.
Fetched 0 record(s) in 41779ms
If you get this you need to lower the length by a few and try again.
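For context: `$size` matches array fields by element count, so it returns nothing for a string field like `category`, which would explain the empty results above. The `$where` query works because it evaluates JavaScript against each document. Below is a small sketch that builds two variants of the filter: a `$where` string that also guards against entries with no category, and a `$regex` filter that avoids running JavaScript per document. The helper names are hypothetical, and whether the regex form is actually faster on a given database is untested here.

```javascript
// $where variant: same idea as the query above, but skips documents
// that have no category field instead of erroring on them.
function longCategoryWhere(maxLength) {
  return { $where: 'this.category && this.category.length > ' + maxLength };
}

// $regex variant: matches category strings longer than maxLength characters.
// [\s\S] is used instead of "." so that newlines inside garbled data also count.
function longCategoryFilter(maxLength) {
  return { category: { $regex: '^[\\s\\S]{' + (maxLength + 1) + ',}$' } };
}

// In the mongo shell this could be used roughly as:
//   db.getCollection('torrents').find(longCategoryFilter(20))
```

Either form still has to scan the whole collection, so on millions of records it can take a while.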
Right, I never specified what data to look at lmao. One sec, let me run it.
EDIT: Yes, it's taking a bit, BUT it is running.
Brilliant, it just found 14 bad entries. I shall keep this command on hand. Thank you very much kind sir!
*ma'am
You're welcome. @Stephen304 maybe for now this could be added to the wiki under troubleshooting?
Oh! My apologies ma'am. And yes, this should be added to the wiki. I'm sure it can come in handy for many people
Thanks for the useful troubleshooting, I'll add it to the wiki when I get a chance. The weird thing is it doesn't look like the entry has an info hash. I'm not sure how that happened.
I have over 5 million torrents imported so far, and I wish to import another 20 million. I'm on a mission to import a full database of.. almost all torrents. Call me crazy! But I have got it working just fine. In fact, it's fast! But the browse page has broken for some reason. Any ideas?