A-K-Mishra opened this issue 1 year ago
Hi Aniket,
There are multiple problems involved here; a few of them are current limitations of Clad. I'd like to briefly go over the important ones. Right now I will cover the main issue, an incorrect compilation command, and go over the other important issues tomorrow. For now, please note that Clad does not support this example as-is; we will need to modify the example to make it work.
Compilation command issue
The compilation command used is not quite correct. Using clang -cc1 directly is generally difficult to get right, because you must manually pass the correct standard library paths and compiler options. I would suggest explicitly passing options to the clang compiler through the clang driver using the -Xclang option. The command shown below can be used to compile the program and run Clad:
clang++ -o main.elf main.cpp -Xclang -add-plugin -Xclang clad -Xclang -load -Xclang /path/to/clad.so -I/path/to/clad/include -xc++
Hi @parth-07, this issue was actually raised with the intention of making Clad support object-oriented programming and other constructs. As far as compilation is concerned, thanks for your reply; I will definitely try it. I had already tried this code on Binder, but the kernel crashes every time, so I thought the stack dump I posted above might be the relevant stack trace. I will try your command and hopefully update the stack dump.
Hi @A-K-Mishra
Yes, currently, Clad does not support your example code. It would currently crash with the example you have provided. I'd like to briefly go over some of the important limitations of Clad in the context of your example.
This has been pending for a long time. It is not necessarily difficult to support; we simply have not had a chance to implement it in Clad yet. It should not be a blocker for you, though, if you can convert pointers to references in the code that is to be differentiated.
This is by design. Functions that are to be differentiated by Clad are expected to be pure: informally, the same input should always produce the same output. Global variables potentially violate this. One effect of treating global variables as constants is that they have no corresponding derivative variables, which can lead to incorrect derivatives depending on how the global variables are used within the primal function.
This is also not necessarily a blocker if you can pass C_obj as an argument to the functions being differentiated. That way, it is no longer a global variable from the perspective of those functions.
This was the first prototype, and we have not had time to improve it. The design can definitely be improved by making things more explicit. We will always need some way to construct zero-derivative objects of a class.
The C_Super class has no default constructor. Simply adding one will not solve the issue, because the derivative object needs its _jumps and _jumpsValue members to have the correct sizes.
There are other issues as well, for example differentiating operator+= for vectors. Clad cannot automatically differentiate this as of now, but Clad's custom derivative functionality can be used to tell Clad how to differentiate vector operator+=.
actually this issue has been raised with the intention of making clad support object-oriented programming and other constructs.
If the goal is to make Clad support object-oriented differentiable programming, then I would recommend first supporting all the necessary constructs in forward and reverse mode AD, and focusing on the other modes (hessian and jacobian) later. Jacobian mode, in particular, has a few existing bugs which should be fixed first. Please let me know if you want to discuss object-oriented differentiable programming.
Hi @A-K-Mishra
I have changed the code presented in the issue such that it works with Clad.
#include "clad/Differentiator/Differentiator.h"
#include <iostream>
#include <vector>

using namespace std;

struct C_Super {
  C_Super() {}
  vector<double> _jumps;
  vector<double> _jumpsValue;
  C_Super(vector<double> &jumps, vector<double> &jumpsValue)
      : _jumps(jumps), _jumpsValue(jumpsValue) {}
};

namespace clad {
namespace custom_derivatives {
void multiply_val_pullback(::std::vector<double> &v, int idx, double val,
                           ::clad::array_ref<::std::vector<double>> d_v,
                           ::clad::array_ref<int> d_idx,
                           ::clad::array_ref<double> d_val) {
  double t = *d_val;
  *d_val = 0;
  (*d_v)[idx] += val * t;
  *d_val += v[idx] * t;
}
} // namespace custom_derivatives
} // namespace clad

void multiply_val(::std::vector<double> &v, int idx, double &val) {
  val *= v[idx];
}

// Region I can't control
// starts here
vector<double> jumps({-1, 0.2, 4.5, 9.0, 7.7}),
    jumpsValue({0.1, 0.01, 0.3, 0.1, 0.1});
C_Super *C_obj = new C_Super(jumps, jumpsValue);
C_Super *d_C_obj = new C_Super();
// ends here

double calculate_m(C_Super &obj, double t) {
  double jumpEffect = 1.0;
  for (int i = 0; i < obj._jumps.size(); i++) {
    if (obj._jumps[i] >= 0 && obj._jumps[i] <= t) {
      multiply_val(obj._jumpsValue, i, jumpEffect);
    }
  }
  return jumpEffect;
}

double calculate(double t, C_Super &c) {
  // return 0.0;
  return calculate_m(c, t);
}

void npvCalc(double t, double t_end, C_Super &c, double &npv) {
  npv = 0.4 * calculate(t, c) + 0.6 * calculate(t_end, c);
}

double npvCalcWrapper(double t, double t_end, C_Super &c, double &npv) {
  npvCalc(t, t_end, c, npv);
  return npv;
}

int main() {
  double t1 = 0.1, t2 = 10.0;
  double npv[1] = {};
  double derivatives[2] = {};
  auto d_npv = clad::gradient(npvCalcWrapper);
  d_npv.dump();
  // d_npv.execute(t1, t2, npv, derivatives);
}
Notable changes are:
1) I have added a default constructor to the C_Super class. Clad currently requires each differentiable class to have a default constructor; the default-constructed object is taken as the zero derivative in the mathematical space represented by the class. This restriction will be lifted in the future.
2) I have changed jumpEffect *= obj._jumpsValue.at(i);
to:
multiply_val(obj._jumpsValue, i, jumpEffect);
namespace clad {
namespace custom_derivatives {
void multiply_val_pullback(::std::vector<double> &v, int idx, double val,
                           ::clad::array_ref<::std::vector<double>> d_v,
                           ::clad::array_ref<int> d_idx,
                           ::clad::array_ref<double> d_val) {
  double t = *d_val;
  *d_val = 0;
  (*d_v)[idx] += val * t;
  *d_val += v[idx] * t;
}
} // namespace custom_derivatives
} // namespace clad

void multiply_val(::std::vector<double> &v, int idx, double &val) {
  val *= v[idx];
}
Clad still needs support for two additional features before it can automatically differentiate jumpEffect *= obj._jumpsValue.at(i); directly; until then, the custom derivative shown above is required.
3) Instead of using a global C_obj, I have changed the code so that C_obj is passed as an argument to the functions.
4) I have also changed clad::jacobian to clad::gradient based on our discussion. For this, I have added a wrapper function, npvCalcWrapper.
Apart from this, I suspect the below loop is being differentiated incorrectly:
for (int i = 0; i < obj._jumps.size(); i++) {
  if (obj._jumps[i] >= 0 && obj._jumps[i] <= t) {
    jumpEffect *= obj._jumpsValue[i];
  }
}
According to the differentiated code, jumpEffect seems to be independent of the variable t, but that is not actually the case.
Please let me know if you have any questions.
Consider the code:
This code on compilation with the command:
gives the following Stack dump: