bigmlcom / sensenet

0 stars 8 forks source link

Reading and Tree Op Refactor #16

Closed charleslparker closed 3 years ago

charleslparker commented 3 years ago

While integrating with wintermute, some performance things came up. Specifically, the custom tree operator is apparently really expensive for tensorflow to serialize, so when there are dozens of them, serializing the network was taking half an hour. I've basically folded multiple trees into the tree operator, so now it's basically an "ensemble operator". This reduced serialization from 1800 seconds to 10 seconds.

(As an aside, this isn't the first tensorflow performance problem I've had. It's mostly a nice thing, but lurking everywhere are scary inefficiencies that you can uncover by accident if your computation graph gets too big. When you generate them programmatically, this can easily happen by accident).

I've also refactored reading here so it's more a part of the client's responsibility; the image reading code in tensorflow was just not extensible or flexible enough to handle most use cases. The model wrappers (or client code for other bindings) will now take care of reading the image so tensorflow itself only deals with arrays of pixels.

charleslparker commented 3 years ago

@unmonoqueteclea - Unfortunately, this meant I had to add a dependency on PIL to read the images. It's common enough though that I don't think it will cause major problems.

unmonoqueteclea commented 3 years ago

Does the interface change? I understand that I do not need to use, directly, the helper functions that you included to load images because you are already taking care of that in the model wrapper.

As exported models already worked with pixel inputs, there won't be any change in that interface either.

Should I remove input_image_format or any other parameter from model settings or they are still being used by the wrapper?

unmonoqueteclea commented 3 years ago

No problem with the PIL dependency, I was already using it in many projects

charleslparker commented 3 years ago

Should I remove input_image_format or any other parameter from model settings or they are still being used by the wrapper?

You can remove that and image_file_prefix if you were using that. And in fact, I think it will throw an exception if you do not remove them. If this is hard I can leave them in and they will have no effect.

charleslparker commented 3 years ago

If there's nothing else besides the tensorflow-gpu thing, could you accept this PR @unmonoqueteclea ? I'd like to have that latter in a separate PR.