kite-sdk / kite

Kite SDK
http://kitesdk.org/docs/current/
Apache License 2.0
394 stars 263 forks source link

CDK-976: DatasetKeyInputFormat/DatasetKeyOutputFormat setting job con… #368

Closed EdwardSkoviak closed 9 years ago

EdwardSkoviak commented 9 years ago

…figuration before loading dataset

rdblue commented 9 years ago

Thanks @EdwardSkoviak! I just have a couple of comments about where the configuration gets set. Thank you for taking the time to fix this!

rdblue commented 9 years ago

The Travis build appears to be failing because it can't find DefaultConfiguration. Maybe you're missing an import?

EdwardSkoviak commented 9 years ago

The Travis build appears to be failing because it can't find DefaultConfiguration. Maybe you're missing an import?

Yea sorry, I just snagged the line I added from our forked code and popped it in. The changes you propose above make sense, I will go ahead and make them.

EdwardSkoviak commented 9 years ago

@rdblue I made the proposed changes, let me know if it looks good and I'll squash it into one commit.

rdblue commented 9 years ago

@EdwardSkoviak, what testing did you do? Would you mind making sure these changes fix your problem if you haven't already?

EdwardSkoviak commented 9 years ago

@rdblue The original commit is a change that we've tested and verified that it solved the issue. I'll reverify with these changes but currently I've got a lot on my plate. I'll try to squeeze in this evening or tomorrow morning.

rdblue commented 9 years ago

Thanks! I really appreciate it.

EdwardSkoviak commented 9 years ago

@rdblue I'm going to test this out shortly, sorry for the delay.

rdblue commented 9 years ago

No problem at all, thanks for taking the time to help!

EdwardSkoviak commented 9 years ago

So with the test scenarios I've been running it looks like based on the above changes the fix no longer works. Getting the following[1] which prompted the initial change set. I will try to look at this more this weekend if i get a chance.

[1] https://gist.github.com/EdwardSkoviak/0b3ed6971fd4e4c51624

rdblue commented 9 years ago

@EdwardSkoviak, thanks for the trace. Looks like we call load in getOutputCommitter because we need to check which committer to use. That means we need to setup the environment in getOutputCommitter rather than in the committer. So I think the fix is to add it to what is currently line 508, just before calling load. Then revert the changes to both committers. That should do it.

EdwardSkoviak commented 9 years ago

@rdblue Went ahead and made that change. Just tested it out, works good. Thanks for the help.

rdblue commented 9 years ago

@EdwardSkoviak, I'm closing this because I had to make changes to get tests to pass. I'll monitor the tests and get this in as #371. Thanks for your contribution!