ratt-ru / montblanc

GPU-accelerated RIME implementations. An offshoot of the BIRO projects, and one of the foothills of Mt Exaflop.

Zernike Polynomial Random Inputs Test Case Failing #260

Open sjperkins opened 5 years ago

sjperkins commented 5 years ago

The Zernike polynomial random_inputs test case seems to be failing on the dask-tf-1.4 branch. It appears to output zeros or NaNs for the random inputs. The test case should be updated so that representative random input produces reasonable, non-trivial output.

I've made some updates to the test cases, mostly to improve formatting and to make the comparison tolerances more consistent across single and double precision. I haven't modified the actual operator code. See the difference here: https://github.com/ska-sa/montblanc/compare/326119608b8505569a3587de58f05457767fcb98...1fee29be7fbbb88b119e6f91fd304f27f6416f0e
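
For illustration only, the kind of sanity check described above (output that is neither all zeros nor contaminated by NaNs) could be sketched as follows. The helper name `is_degenerate` is hypothetical and is not part of montblanc or its test suite; it uses only the Python standard library and takes a flat sequence of real or complex values.

```python
import cmath


def is_degenerate(values):
    """Return True if the values contain a NaN or are all exactly zero.

    Hypothetical helper: an empty input is also reported as degenerate,
    because there is nothing meaningful to compare against.
    """
    # Promote everything to complex so real and complex outputs are handled alike.
    values = [complex(v) for v in values]
    # cmath.isnan is True when either the real or the imaginary part is NaN.
    has_nan = any(cmath.isnan(v) for v in values)
    all_zero = all(v == 0 for v in values)
    return has_nan or all_zero
```

A test could call such a check on the flattened operator output before comparing against the reference implementation, so that a comparison between two all-zero arrays does not pass silently.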

sjperkins commented 5 years ago

@joshvstaden Could you please take a look? You'll need to check out the latest dask-tf-1.4 branch of montblanc.

JoshVStaden commented 5 years ago

@sjperkins I updated my Montblanc fork to match yours, and I can't seem to install it. It seems to fail at compile time, and I get the following issues:

montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:561:5: error: expected class-name before ‘{’ token
Can't roll back montblanc; was not uninstalled
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:607:33: error: ‘DatasetGraphDefBuilder’ has not been declared
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:578:9: error: ‘montblanc::{anonymous}::SimpleMapDatasetOp::Dataset::~Dataset()’ marked ‘override’, but does not override
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:589:32: error: ‘const DataTypeVector& montblanc::{anonymous}::SimpleMapDatasetOp::Dataset::output_dtypes() const’ marked ‘override’, but does not override
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:592:49: error: ‘const std::vector<tensorflow::PartialTensorShape>& montblanc::{anonymous}::SimpleMapDatasetOp::Dataset::output_shapes() const’ marked ‘override’, but does not override
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:599:9: error: ‘std::unique_ptr<tensorflow::data::IteratorBase> montblanc::{anonymous}::SimpleMapDatasetOp::Dataset::MakeIteratorInternal(const string&) const’ marked ‘override’, but does not override
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:606:16: error: ‘tensorflow::Status montblanc::{anonymous}::SimpleMapDatasetOp::Dataset::AsGraphDefInternal(tensorflow::OpKernelContext*, int*, tensorflow::Node**) const’ marked ‘override’, but does not override
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:552:17: error: cannot convert ‘montblanc::{anonymous}::SimpleMapDatasetOp::Dataset*’ to ‘tensorflow::data::DatasetBase*’ in assignment
montblanc/impl/rime/tensorflow/rime_ops/simple_map_dataset.cpp:569:19: error: class ‘montblanc::{anonymous}::SimpleMapDatasetOp::Dataset’ does not have any field named ‘GraphDatasetBase’
from /home/joshua/mb/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/lib/core/errors.h:21,
/home/joshua/mb/local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/framework/dataset.h:715:38: error: no matching function for call to ‘tensorflow::data::DatasetBaseIterator::DatasetBaseIterator(<brace-enclosed initializer list>)’
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

These seem to be syntax and declaration errors in general, which is strange because I imagine you aren't seeing any of these on your side.

sjperkins commented 5 years ago

@JoshVStaden The required version of tensorflow changed (to 1.12.0 I believe). Out of interest how are you setting this up and installing this? At the very least you should be running this in a virtual environment and running:

$ pip install ~/code/montblanc

I would expect the above to complain about the tensorflow version.
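
As a rough way to confirm which TensorFlow release the virtual environment actually contains, a check like the sketch below may help. The helper names are hypothetical, the 1.12.0 default is taken from the comment above (stated there as uncertain), and comparing only the major and minor components is an assumption about how strictly the compiled ops depend on the TensorFlow headers.

```python
def major_minor(version):
    """Return (major, minor) from a version string such as '1.12.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def matches_required(installed, required="1.12.0"):
    """True when the installed release shares major.minor with the required one.

    Assumption: the custom C++ ops are built against one specific
    TensorFlow minor release, so both older and newer releases are rejected.
    """
    return major_minor(installed) == major_minor(required)


# Typical use inside the virtual environment (needs TensorFlow installed):
# import tensorflow as tf
# print(tf.__version__, matches_required(tf.__version__))
```

If this reports a mismatch, the compile errors are more likely caused by the installed TensorFlow headers than by the montblanc sources themselves.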

JoshVStaden commented 5 years ago

@sjperkins I did notice that the required tensorflow version changed, and I managed to install the latest version without any issues. I am running it in a virtual environment, and the command that I run is pip install -e ~/montblanc/

sjperkins commented 5 years ago

Are you in the dask-tf-1.4 branch?

JoshVStaden commented 5 years ago

Yes

sjperkins commented 5 years ago

Ping me on hangouts?

JoshVStaden commented 5 years ago

I'm unfortunately in PE for the day; I'll only be back late this afternoon.

JoshVStaden commented 5 years ago

@sjperkins Hi Simon, do you still need to do a Hangouts?