Closed taylorwood closed 8 years ago
I don't think so b/c they match by ID -> but it is worth checking. Easy enough to test.
Is it cool if I upload a submission? I think the score will change because I compared the before/after output and the per-ID relevancies were pretty different.
Go ahead. We have 4 for the day.
Also, you might want to leave the pseq in and add 1 more order clause on the id
You improved on your best score by 0.02779. You just moved up 67 positions on the leaderboard.
I'll look into using PSeq.map again while preserving the order of the data. I think the reason it matters is that the output is zipped with the CSV input rows, which are always in the "right" order, but the PSeq.map output isn't.
Sweet!
On Mon, Feb 1, 2016 at 8:58 AM, Taylor Wood notifications@github.com wrote:
— Reply to this email directly or view it on GitHub https://github.com/jamessdixon/Kaggle.HomeDepot/pull/15#issuecomment-177983034 .
PSeq.map doesn't preserve the order of the input sequence, so the output was "scrambled". I noticed this after using PSeq.map to parallelize the output CSV; the IDs were unsorted. I think this might've significantly affected the score.
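
For reference, a minimal sketch of two ways to keep predictions lined up with the input rows. This is not the code from this PR: `predict` and `rows` are hypothetical stand-ins, and the sketch uses `Array.Parallel.map` from FSharp.Core (which writes each result to the slot of its input) rather than `PSeq.map`, so it runs without extra packages.

```fsharp
// Hypothetical stand-in for the per-row relevance prediction.
let predict (id: int, text: string) = id, float text.Length

// Hypothetical input rows, in the order they appear in the CSV.
let rows = [| 3, "hammer"; 1, "drill bit"; 2, "saw" |]

// Option 1: Array.Parallel.map runs in parallel but stores result i at
// index i, so the output order matches the input order.
let predictions = rows |> Array.Parallel.map predict

// Zipping is now safe: position i in both arrays refers to the same row.
for ((id, _), (predId, score)) in Array.zip rows predictions do
    if id <> predId then failwith "rows and predictions are misaligned"
    printfn "%d,%f" id score

// Option 2: if the parallel step may reorder results, carry the id through
// (as predict does here) and sort on it before writing the output.
let sortedById = predictions |> Array.sortBy fst
```

Option 2 is the "one more order clause on the id" idea from earlier in the thread; it only helps if the output is written from the sorted results rather than zipped positionally with unsorted ones.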