jackdreilly / SparkADMM

ADMM implementation on Spark Cluster

RCV1Data #8

Closed JojoG closed 12 years ago

JojoG commented 12 years ago

Hey Jack,

I am still working on my LocalLogR code, and I think I found what is happening. I may be wrong, but there seems to be a problem in the RCV1Data code. When you split the set into two parts, the two matrices A no longer seem to be of the same kind. Try this in my code:

val ATest = trainSet._1
println("ATest")
println(ATest)

and you will get the size of the matrix AND all the nonzero values.

Now try:

val ATest = testSet._1
println("ATest")
println(ATest)

and you get ONLY the size of the matrix. It seems we cannot access the values, and I do not know why...

If you have any ideas, let me know^^ I'll have a look at the RCV1Data code in the meantime.

jackdreilly commented 12 years ago

verified the problem, will look into it as well, good find!

JojoG commented 12 years ago

OK, I fixed the RCV1Data code. It was just a small mistake in how you wrote the docIndex. You can pull the new version from the johann branch. You do not need to download the testRCV1Data file; I only used it to find the error.
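
For illustration only, here is a minimal sketch of how a document-index mistake could produce the symptom above (a test matrix with the right size but no stored values). The thread does not show the actual fix, so this is an assumption: it supposes that the rows of the second part kept their global document indices instead of being re-based to start at 0. `Entry`, `SplitSketch`, `slice`, and `sliceWithoutOffset` are hypothetical names, not taken from RCV1Data.

```scala
// Hypothetical sparse entry: (document index, feature index, value).
final case class Entry(doc: Int, feature: Int, value: Double)

object SplitSketch {
  // Five documents, one nonzero value each (made-up sample data).
  val sample: Seq[Entry] = Seq(
    Entry(0, 1, 1.0), Entry(1, 0, 2.0), Entry(2, 2, 3.0),
    Entry(3, 1, 4.0), Entry(4, 0, 5.0)
  )

  // Keep documents in [from, until) and re-base their indices to start at 0,
  // so they fit a matrix with (until - from) rows.
  def slice(entries: Seq[Entry], from: Int, until: Int): Seq[Entry] =
    entries.collect {
      case e if e.doc >= from && e.doc < until => e.copy(doc = e.doc - from)
    }

  // Assumed buggy variant: the offset is forgotten, so an entry only survives
  // if its global index already fits inside the slice's row count.
  def sliceWithoutOffset(entries: Seq[Entry], from: Int, until: Int): Seq[Entry] = {
    val rows = until - from
    entries.filter(e => e.doc >= from && e.doc < until && e.doc < rows)
  }
}
```

With a split at document 3, `SplitSketch.sliceWithoutOffset(SplitSketch.sample, 0, 3)` keeps all three training entries, while `SplitSketch.sliceWithoutOffset(SplitSketch.sample, 3, 5)` keeps none. Under this assumption the first part looks fine and the second part looks empty. `SplitSketch.slice` returns the two test entries with document indices 0 and 1.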

jackdreilly commented 12 years ago

fixed by commit 29fe5bce817adbee8fc9248778e9f5669e57774f