ZivIzhar closed 7 years ago
We ramped up on MySQL and on XML-to-MySQL conversion. We tried converting a small XML file to MySQL, but right now we lack the resources to continue.
We filtered out the irrelevant fields, but we need to make final decisions about them with the rest of the team.
No progress for 9 days????? @idabran @zvili
We had problems processing the data on our computers, and we are waiting for @AdiOmari to give us access to a server.
Delays of this sort should be reflected in issues. In many cases, it is possible to use the team's resources and people while they wait. But if the team does not report that it is waiting, the impression is that they are just doing nothing. If all you did was wait, this is NOT GOOD. If you did something else and did not report it, this is also NOT GOOD, but it can be fixed with later reports.
Relevant fields from the data were decided.
The fields we chose to take from the XML are: Id, PostTypeId, ParentId, AcceptedAnswerId, Score, Body, Title, Tags, AnswerCount.
We wrote the SQL code that gets these fields. We are now working on the Java code that will call our SQL command.
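For reference, the field selection above could be sketched in Java roughly as follows. This is only an illustration, not the team's actual code: it assumes that Posts.xml stores each post as a `<row>` element with the fields as attributes, and the class name `PostFieldExtractor` and its `extract` method are hypothetical. It streams the file with StAX, so the whole dump never has to fit in memory.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams <row> elements and keeps only the chosen fields.
class PostFieldExtractor {
    // The fields the team decided to keep (from the comment above).
    static final String[] FIELDS = {
        "Id", "PostTypeId", "ParentId", "AcceptedAnswerId",
        "Score", "Body", "Title", "Tags", "AnswerCount"
    };

    // Assumption: every post is a <row .../> element whose attributes hold the fields.
    // Each parsed post is handed to the sink immediately instead of being collected,
    // so memory use stays flat regardless of file size.
    static void extract(Reader xml, Consumer<Map<String, String>> sink) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "row".equals(reader.getLocalName())) {
                        Map<String, String> post = new LinkedHashMap<>();
                        for (String field : FIELDS) {
                            // A missing attribute (e.g. ParentId on a question) yields null.
                            post.put(field, reader.getAttributeValue(null, field));
                        }
                        sink.accept(post);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
    }
}
```

In a real import the sink would presumably write each map to MySQL (for example through a batched insert) rather than hold it in memory.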
Any related commits?
Soon. We are having problems connecting to our local MySQL server.
Regarding the MySQL server on the csl server: in order not to affect Adi's databases, we can use another instance, as shown here: http://dev.mysql.com/doc/refman/5.7/en/multiple-windows-command-line-servers.html
@AdiOmari where should we save the Posts.xml? I guess we shouldn't add it to GitHub (50 GB), but then the code that converts it to MySQL will be broken on computers other than the server.
Converting the Posts.xml of SO (the big DB, not the demo we used so far) to MySQL. The process will probably take a few hours.
Successfully imported the XML into the MySQL server. The server is accessible at localhost:3306.
@ZivIzhar @tonylekhtman ".executeQuery("SELECT * FROM so_posts WHERE Id < " + (i + 10000)+" AND Id > "+i);" You should use the SQL "LIMIT" clause instead: link.
I tried using it, but it returned the same rows for different ranges (you also need to add ORDER BY(Id)). It works the way I wrote it, and I have almost finished importing. I can look into why it doesn't work with LIMIT, but the DB is already ready.
@tonylekhtman if it is the import code then fine, just make sure your query function (used by Yonatan and Roded) uses LIMIT. (LIMIT works like this: LIMIT START_INDEX, NUMBER_OF_ROWS_NEEDED.)
OK
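A minimal sketch of the LIMIT-based paging discussed above, assuming the `so_posts` table from the quoted query. The class name `PostPageQuery` and the method `forPage` are hypothetical. The `ORDER BY Id` matters because MySQL does not guarantee row order without it, which would explain seeing the same rows for different ranges.

```java
// Hypothetical helper: builds the SQL text for one page of posts.
class PostPageQuery {
    // pageIndex is zero-based; pageSize is the number of rows per page.
    static String forPage(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so that large page indexes do not overflow int.
        long offset = (long) pageIndex * pageSize;
        // MySQL syntax: LIMIT offset, row_count. ORDER BY makes the pages stable.
        return "SELECT * FROM so_posts ORDER BY Id LIMIT " + offset + ", " + pageSize;
    }
}
```

For example, `PostPageQuery.forPage(2, 10000)` yields a query that skips the first 20000 rows and returns the next 10000. Only integers are concatenated here; a query that includes user-supplied values should go through a `PreparedStatement` instead.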
StackOverflow is downloadable in XML format from here: https://archive.org/details/stackexchange In order to use it, we need to convert it to MySQL.