isohuntto / openbay-db-dump

GNU General Public License v3.0
110 stars 39 forks

Schema ? #1

Open rakoo opened 9 years ago

rakoo commented 9 years ago

Here's the first line of the dump:

"Tha Twilight New Moon DVDrip 2009 XviD-AMiABLE"|694554360|2cae2fc76d110f35917d5d069282afd8335bc306|0|movies|0|1

What is the format of the lines? I can easily guess the first 3 columns and the movies tag, but I wonder about the other 3 numbers.

NewEraCracker commented 9 years ago

I've been playing around with these commands to import it into a MariaDB database.

LOAD DATA INFILE 'torrents_mini.csv' INTO TABLE piratebay.torrents
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
(name, size, hash, downloads_count, tags, seeders, leechers);
UPDATE piratebay.torrents SET description='' WHERE description IS NULL;
UPDATE piratebay.torrents SET category_id=1 WHERE tags='anime';
UPDATE piratebay.torrents SET category_id=2 WHERE tags='software';
UPDATE piratebay.torrents SET category_id=3 WHERE tags='games';
UPDATE piratebay.torrents SET category_id=4 WHERE tags='adult';
UPDATE piratebay.torrents SET category_id=5 WHERE tags='movies';
UPDATE piratebay.torrents SET category_id=7 WHERE tags='other';
UPDATE piratebay.torrents SET category_id=6 WHERE tags='music';
UPDATE piratebay.torrents SET category_id=8 WHERE tags='series & tv';
UPDATE piratebay.torrents SET category_id=9 WHERE tags='books';
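
For reference, the real schema ships with openbay (protected/data/), but a rough sketch of a piratebay.torrents table that would accept the statements above could look like this (the column types and the extra description/category_id columns are my assumptions based on the UPDATEs, not the official schema):

CREATE TABLE piratebay.torrents (
  name VARCHAR(255),
  size BIGINT,
  hash CHAR(40),
  downloads_count INT,
  tags VARCHAR(32),
  seeders INT,
  leechers INT,
  description TEXT,
  category_id INT
);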

Still, I'd like an official reply from the maintainers.

hutchy1990 commented 9 years ago

My guess is that the best way to do this is with a PHP file: get it to put the data in the correct order, then add it to your DB. I'm going to try to do this, but it will probably take me a few days as I'm a noob, though Google tutorials will help, haha.

milezzz commented 9 years ago

I think the first number is a UNIX timestamp. 694554360 is equivalent to:

01/04/1992 @ 7:46pm (UTC)

Hmm, but if that's true, the timestamp seems wrong. The next field is the info_hash: 2cae2fc76d110f35917d5d069282afd8335bc306
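
If anyone wants to double-check that reading, MySQL/MariaDB can convert the value directly; it returns 1992-01-04 19:46:00 when the session time zone is UTC:

SELECT FROM_UNIXTIME(694554360);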

hutchy1990 commented 9 years ago

Definitely meant to be the timestamp. It can't be the number of downloads; no website got 694 million people to download Twilight.

NewEraCracker commented 9 years ago

694554360 bytes is 662.38 MiB. Looks like the size of a common DVDrip release.
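
For anyone checking the arithmetic, the conversion is plain division (this assumes the second column really does hold the size in bytes):

SELECT 694554360 / 1024 / 1024 AS size_mib;  -- ≈ 662.38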

hutchy1990 commented 9 years ago

That's it :+1: Has anyone managed to make this DB work with OpenBay? I'm going to work on a user system with comments and uploads before I try to get this DB working.

LucasRoot commented 9 years ago

The other numbers in |0|movies|0|1 could be: files, seeds and leechers.

rakoo commented 9 years ago

@LucasRoot I started with that too, but a downloads_count, as @NewEraCracker said, seems much more likely than files (a torrent can't have 0 files). Let's see what the official answer will be!

LucasRoot commented 9 years ago

I don't think there will be an official answer.

milezzz commented 9 years ago

Good job guys! So the column mapping is:

"Tha Twilight New Moon DVDrip 2009 XviD-AMiABLE"|694554360|2cae2fc76d110f35917d5d069282afd8335bc306|0|movies|0|1

name | size | hash | downloads_count | category_id* | seeders | leechers

*Please note that category_id needs to be an INT which will need to be mapped to the appropriate category.
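
If it helps, the per-tag UPDATEs from earlier in the thread can be collapsed into a single statement once the data is loaded (a sketch; the numeric ids are just the mapping NewEraCracker used above, nothing official):

UPDATE piratebay.torrents SET category_id =
  CASE tags
    WHEN 'anime'       THEN 1
    WHEN 'software'    THEN 2
    WHEN 'games'       THEN 3
    WHEN 'adult'       THEN 4
    WHEN 'movies'      THEN 5
    WHEN 'music'       THEN 6
    WHEN 'other'       THEN 7
    WHEN 'series & tv' THEN 8
    WHEN 'books'       THEN 9
  END;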

LucasRoot commented 9 years ago

And the size is in bytes.

hutchy1990 commented 9 years ago

OK, so the best way to tackle this is PHP, in my opinion.

Get PHP to load in the CSV file, reading each piece of data, sorting it into the correct order and uploading it to MySQL. This is going to take quite a bit of power since the file is so large. It would probably take someone an hour to knock up the script and hours to run it; if the page times out, then my theory is fucked and a new method will need to be tried. I could probably do this, but it would take me hours if not days because I will have to Google every aspect, like how to load a CSV file into PHP, but it is possible and really easy for experienced PHP programmers.

LucasRoot commented 9 years ago

@hutchy1990 I split the file and used the example from @NewEraCracker.

rakoo commented 9 years ago

Thanks @milezteg!

I used SQLite (more than enough for my personal needs), in which you can use the import command:

CREATE TABLE torrents (
  name TEXT,
  size INTEGER,
  hash TEXT UNIQUE,
  downloads_count INTEGER,
  category TEXT,
  seeders INTEGER,
  leechers INTEGER
);

.mode csv
.separator "|"
.import torrents_mini.csv torrents

Works like a charm; there must be some equivalent for other databases too.
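
A couple of quick sanity checks after an import never hurt; this is plain SQL, so it works in SQLite and MySQL alike (the exact counts will of course depend on which dump you loaded):

SELECT COUNT(*) FROM torrents;
SELECT name, size, seeders, leechers FROM torrents LIMIT 5;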

milezzz commented 9 years ago

Awesome work @rakoo! Going to test soon, but I am away from my desk atm. One thing that really irks me is that there is no timestamp in this data, making it impossible to sort by new entries. That really limits the usefulness of this data, sadly.

hutchy1990 commented 9 years ago

I have this working, will upload the DB soon.

nexusrain commented 9 years ago

I'm stuck getting the CSV into MySQL. I created a database and a table using the included schema file (protected/data/), and then ran

load data infile '(path to csv)' into table torrents fields terminated by '|' enclosed by '"' lines terminated by '\n';

But all I get is an error saying it tries to insert a value somewhere where one already exists. I can't tell you the exact error code right now; I'm on my phone and going to sleep in a few minutes :p

However, it would be really appreciated if someone could upload the DB as an SQL dump.

@rakoo Didn't work for me :( I installed sqlite3, created a DB and a table like you did and ran the commands you used, but I get an error that SQLite expected 7 columns but found 1 in the CSV (what..). No idea what's wrong with that. I also tried with sqlite (instead of sqlite3), but that doesn't like the ".mode csv"..

EDIT: Alright, I think I got the import running now (it's still running and will still take some time; this is a very slow server, only for testing purposes). If everything's working, I'll dump it as SQL and upload it as a torrent file. Would you like that?

EDIT 2: It will take longer than I thought, but I'm on it.

EDIT 3: I think I finally got it ;)

nexusrain commented 9 years ago

Well.. it didn't work as well as I thought, sadly ^^ I made a little SQL file which combines multiple things: the included schema file, the first post in this thread from @NewEraCracker, and a bit from me. You can take a look at it here: http://content.bitforce.io/obimport.sql.txt Then I created a new DB and tried to import the SQL into MySQL.

It takes quite some time, so you'd think it's working.. but after a while there's just an "Error 1062 (2300) at line 36: Duplicate entry '0' for key 'hash'"

Fantastic, huh.. It would be great if someone could help me, because as soon as I get it working, it'll be added to my guide, which is linked here: https://github.com/isohuntto/openbay/issues/74

The person(s) who help me get it working will be mentioned in the guide.
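
If the goal is just to keep the import from aborting on duplicates, MySQL's LOAD DATA supports the IGNORE keyword (a sketch, assuming the same table and column names as in NewEraCracker's post above):

LOAD DATA INFILE 'torrents_mini.csv' IGNORE INTO TABLE piratebay.torrents
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
(name, size, hash, downloads_count, tags, seeders, leechers);

That said, "Duplicate entry '0' for key 'hash'" usually means the fields are not being split the way you expect (many rows ending up with the same value in hash), so IGNORE may only hide a parsing problem; it is worth spot-checking a few imported rows first.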

hutchy1990 commented 9 years ago

Do not add these clauses:

enclosed by '"' lines terminated by '\n';

Make sure the path is not a URL, so something like /home/your_upload_path.

nexusrain commented 9 years ago

Thanks for your answer, @hutchy1990, but sadly it didn't change much :( Here's the new SQL script without those two clauses, and with the CSV file in /var/lib/mysql/(db): http://content.bitforce.io/obimport2.sql.txt

This screenshot shows what I did and the resulting error: http://content.bitforce.io/obimport1.png

And this one shows the attempt to import the new SQL script: http://content.bitforce.io/obimport2.png

Any ideas? :/

hutchy1990 commented 9 years ago

Yeah, I know what's up. I had to make a fake torrent table with this code:

CREATE TABLE torrent (
  name TEXT,
  size INTEGER,
  hash TEXT UNIQUE,
  downloads_count INTEGER,
  category TEXT,
  seeders INTEGER,
  leechers INTEGER
);

Use that code to import the CSV file. Once that's done, use the real torrents table as a reference to add all the same fields, making sure they have the same names as in torrents; then delete torrents and rename torrent to torrents.
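
In MySQL terms, that staging-and-swap approach could look roughly like this (a sketch; the index name, file path and column list are assumptions):

-- staging table: same columns as torrents, but without the UNIQUE index on hash
CREATE TABLE torrent_staging LIKE torrents;
ALTER TABLE torrent_staging DROP INDEX hash;  -- assumes the unique key is named 'hash'

LOAD DATA INFILE '/home/your_upload_path/torrents_mini.csv' INTO TABLE torrent_staging
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
(name, size, hash, downloads_count, tags, seeders, leechers);

-- once the data looks right, swap it in place of the real table
RENAME TABLE torrents TO torrents_old, torrent_staging TO torrents;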

aaruni96 commented 9 years ago

For the whole dump, if I simply let it run using the commands in post 2 of this thread, I get the error shown below.

[screenshot from 2014-12-26 06 03 13]

But doing the same with a CSV file containing only the first 28 entries works.

nexusrain commented 9 years ago

@aaruni Had the same problem. Please check my new thread for the DB as a mo.fu. SQL dump!

NewEraCracker commented 9 years ago

It's very easy to get the full database dump.

IsoHunt publishes the default details to access openbay content: https://isohunt.to/openbay/config.json

So downloading the database is very easy:

mysqldump -u openpiratebay -p"046b3SVJN9TlG5X58z4lMTNxI" opb -h 46.28.69.238 --port=443 --single-transaction > opb.sql

As I write this, 1 hour has passed and the dump is 2 GB and growing.

nexusrain commented 9 years ago

@NewEraCracker u fcking serious.. Do you know what I did to get it nearly working? I wrote an SQL script which imports the CSV (to be more precise, I wrote 4 scripts; the last one worked.. but on CentOS only ^^), and you had to install OpenBay first and then do the import into the DB you specified during the installation of OB. If you imported it first, OB just deleted the whole table and imported the schema file. That was a lot of fun. However, thank you, I will try this now too. I already mentioned you in my German guide (because I updated that one first) and will do the same in the English guide too. :)

P.S.: Oh, and my SQL dump was "only" 1.2 GB. However, I'm "downloading" the database now too. Worked fine (but you don't need the quotes around the password; just put the password directly after the -p switch)!

Edit: It finished some time ago (I did it on a dedicated server with a >100 Mbit guaranteed connection), but the size of my SQL dump is 3.58 GB now and its MD5 is e06725657f79d98073c446333ff33c77 (funny, so many repeating characters). What about yours?

ghost commented 9 years ago

I'm using this command and it isn't working: mysqldump -u openpiratebay -p 046b3SVJN9TlG5X58z4lMTNxI -h 46.28.69.238 -P 443 opb > backup.sql

Any suggestions?

nexusrain commented 9 years ago

@ekoice Like I wrote: NO SPACE between the -p switch and the password! You also forgot the --single-transaction switch (I don't know if it makes a difference, but I don't think @NewEraCracker wrote it just for fun).

ghost commented 9 years ago

It worked with this command, thanks pal! mysqldump -u openpiratebay -p046b3SVJN9TlG5X58z4lMTNxI opb -h 46.28.69.238 --port=443 --single-transaction > opb.sql

NewEraCracker commented 9 years ago

@nexusrain The MD5 hash depends on the number of records and on the contents of the header, which contains the ID of the MySQL/MariaDB version used to dump the contents. Your size is about the same as mine. Using gzip -9 I can compress it to approximately 1.5 GB.

--single-transaction is there to prevent it from running LOCK TABLES; otherwise the dump would fail because the read-only account does not have the privileges for that.

Regards, NewEraCracker

nexusrain commented 9 years ago

@ekoice Pal? @NewEraCracker Yeah, I know what a hash is :) Also, with 7-Zip I got my dump from the CSV down from 1.2 GB to less than 400 MB.

Oh alright, thanks for the explanation! Cheers

ghost commented 9 years ago

@nexusrain mate !!! :+1:

nexusrain commented 9 years ago

@ekoice Oh, sure. You're welcome :)

nexusrain commented 9 years ago

Alright guys, I finally updated my guide with how to host the database on your own server! Here's the link again: https://www.bitforce.io/hacking/howto-setup-openbay-on-debian-ubuntu-with-nginx-1-6-2/