Which repo is most suitable for solving total variation denoising over graph (generalized fused lasso)?

1a7r0ch3 / parallel-cut-pursuit

Cut-pursuit algorithms, parallelized along components

10 stars 7 forks source link

Which repo is most suitable for solving total variation denoising over graph (generalized fused lasso)? #1

Closed silent567 closed 5 years ago

silent567 commented 5 years ago

Hi, thank you for sharing ideas and codes firstly. It's super helpful.

I am looking for codes to solve total variation denoising over graph / generalized fused lasso (python implementation is better). The loss function is as defined in

Padilla, O. H. M., Sharpnack, J., & Scott, J. G. (2017). The DFS fused lasso: Linear-time denoising over general graphs. The Journal of Machine Learning Research, 18(1), 6410-6445.

And I have found several repos of yours and your co-author Loïc Landrieu as follows: cut-pursuit CP_PFDR_graph_d1 pcd-prox-split parallel-cut-pursuit

After reading READMEs in those repos, I think I should focus on this repor (parallel-cut-pursuit).

Am I correct? Also could you please provide the python interface? or providing some C++ example codes so that I can try to wrap them by python?

Thank you very much!

1a7r0ch3 commented 5 years ago

Hello,

If you want to solve the "graph total variation denoising problem" i.e. sum of square errors + graph total variation 1/2 ║y - x║₂² + ║∇_{graph} x║₁ ;

then you read the READMEs right: you must use the code which is in the repository [parallel-cut-pursuit].

More precisely, you want to use the class Cp_d1_ql1b, where

the linear operator A is the identity
the bounds are -inf and + inf
the l1 weights are zero The header file cp_pfdr_d1_ql1b.hpp documents how to set these within C++ code.

Now for higher level interfaces, only the MEX interface for GNU Octave or Matlab is currently implemented...

EDIT: NOW PYTHON INTERFACES AVAILABLE.

On 09/01/2019 08:11, silent567 wrote:

Hi, thank you for sharing ideas and codes firstly. It's super helpful.

I am looking for codes to solve total variation denoising over graph / generalized fused lasso (python implementation is better). The loss function is as defined in

Padilla, O. H. M., Sharpnack, J., & Scott, J. G. (2017). The DFS fused lasso: Linear-time denoising over general graphs. The Journal of Machine Learning Research, 18(1), 6410-6445.

And I have found several repos of yours and your co-author Loïc Landrieu as follows: cut-pursuit CP_PFDR_graph_d1 pcd-prox-split parallel-cut-pursuit

After reading READMEs in those repos, I think I should focus on this repor (parallel-cut-pursuit).

Thank you very much!

Am I correct? Also could you please provide the python interface? or providing some C++ example codes so that I can try to wrap them by python?

silent567 commented 5 years ago

Thank you very much for such detailed reply. I am reading your papers and planning for implementing python interface within a few days. Your explanation is very helpful. Hoping the implementation will be easy.

Thanks a lot! Tang Hao

silent567 commented 5 years ago

Hi, Hugo,

I am not familiar with C++ neither Matlab. Therefore, I am now considering the simplest method to implement python interface by pybind11 as suggested in Create a C++ extension for Python.

It seems that pybind11 depends a lot on C++11. Then my first questions is "Is this repo compatible with C++11 so that pybind11 could be utilized?"

Also as I am not familiar with big projects written in C++, I have problem even in compiling it... Is it possible for you to provide some simple examples in C++ so that I can focus on the interface part only? (compiling commands in linux environment are preferable) It will be very helpful.

Thank you very much! Any kinds of suggestions and help are welcome.

Thank you! Tang Hao

1a7r0ch3 commented 5 years ago

Hello

Pybind11 seems a rather good idea. My source code is definitely compatible with C++11 (in fact, the Cp_d1_ql1b class requires C++11).

If you are not familiar with C++, this might be not so easy. Unfortunately, I do not have full examples written only in C++ ; so far I always used GNU Octave for preprocessing data before calling my C++ code.

In the end it all depends how familiar you are with C++. If you feel comforable enough, then trying Pybind11 (or anything else, check https://intermediate-and-advanced-software-carpentry.readthedocs.io/en/latest/c++-wrapping.html for alternatives) might be worth it. If you are not comfortable enough with C++, I would strongly suggest that you get Octave (free and open source!). If you know Python or other interpreted numerical computing software, learning Octave should be much simpler than delving into subtleties of C++ library linkage.

Good luck Hugo Raguet

On 10/01/2019 11:38, silent567 wrote:

Hi, Hugo,

I am not familiar with C++ neither Matlab. Therefore, I am now considering the simplest method to implement python interface by pybind11 as suggested in Create a C++ extension for Python.

It seems that pybind11 depends a lot on C++11. Then my first questions is "Is this repo compatible with C++11 so that pybind11 could be utilized?"

Also as I am not familiar with big projects written in C++, I have problem even in compiling it... Is it possible for you to provide some simple examples in C++ so that I can focus on the interface part only? (compiling commands in linux environment are preferable) It will be very helpful.

Thank you very much! Any kinds of suggestions and help are welcome.

Thank you! Tang Hao

silent567 commented 5 years ago

Hi, Hugo,

Thanks for replying! It's good to know the repo is compatible with C++11.

For the C++ example, a simple main file that reads in arrays and arguments (any random array is OK), and then prints out the smoothed array, will be perfect for me, (i.e. I am looking for a runnable C++ interface). I have gone through the *mex.c file for Octave, but it seems to be too complicated manipulating almost every details and lots of MEX specific words for an easy start.

For my familiarity of C++, I know the basic knowledge about C++ and have improved on several pure C++ projects before. I think it's enough for developing some interface for python. But I used to develop on projects with compiling commands and a simple main file to start with. Therefore, I have little knowledge about compiling, linking, and related commands, especially for those big projects with a lot of libraries utilized. Would you kindly provide a simple runnable main file with compiling commands? (for me, interface to solve total variation over graph will be enough) It will be appreciated by other C++/C users I think.

For Octave, I have utilized it for several small projects in school, though I don't know the interaction between Octave and C++. It is easy to use but not powerful enough for scientific computing I would say. There are multiple packages I need that are only available in python. Also, I have already implemented the framework in python and trying to improve it by introducing more features. It would be painful and almost impossible for me to transfer it to Octave.

Thank you again for such kind and quick reply! Any kinds of suggestions and advice are welcome. Looking forward to the python interface so that more people will benefit from your work!

Tang Hao

1a7r0ch3 commented 5 years ago

I'll guide you through the meaning of the mex file cp_pfdr_d1_ql1b_mex.cpp (remember that each variable is documented within the header cp_pfdr_d1_ql1b.hpp).

This is the main function, where nlhs is the number of outputs, plhs contains the output, nrhs is the number of inputs, and prhs contains the inputs:

 /* template for handling both single and double precisions */
 template<typename real_t, mxClassID mxREAL_CLASS>
 static void cp_pfdr_d1_ql1b_mex(int nlhs, mxArray **plhs, int nrhs,
     const mxArray **prhs)

This part take care of the parameters for the quadratic functional :

 /* quadratic functional */
 size_t N = mxGetM(prhs[1]);
 index_t V = mxGetN(prhs[1]);

 const real_t *Y = ...
 const real_t *A = ...
 const real_t a = ...

 if (V == 1){ /* quadratic functional is only weighted square

difference */ ... }

For the simple square difference 1/2 ║y - x║₂² , you'll want V = <size of y or x>, N = DIAG_ATA (macro signalling a diagonal matrix), A = nullptr, a = 1.0.

Now this is the graph structure, you'll have to set it up properly (this is a forward-star representation of the graph, see cut_pursuit.hpp):

 /* graph structure */
 index_t E = ... // number of edges
 const index_t *first_edge = ...
 const index_t *adj_vertices = ...

Now the penalizations (you want only the total variation, with edge_weights or homo_edge_weight):

 /* penalizations */
 const real_t *edge_weights = <pointer to a float array of size E or

nullptr if all weigths are equals>; real_t homo_edge_weight = <set the constant weight here if all weights are equalt>;

 const real_t* Yl1 = nullptr;
 const real_t *l1_weights = nullptr;
 real_t homo_l1_weight = 0;

 const real_t *low_bnd = nullptr;
 real_t homo_low_bnd = -INF_REAL;
 const real_t *upp_bnd = nullptr;
 real_t homo_upp_bnd = INF_REAL;

Start with the default algorithmic parameters, you might want to change some later if you want more or less precision

/ algorithmic parameters / ....

This creates the arrays for the ouput (you'll mostly want one for the components assignement of size V and of type int):

 /**  prepare output; rX (plhs[1]) is created later  **/
 plhs[0] = mxCreateNumericMatrix(V, 1, COMP_CLASS, mxREAL); // this

is mex specific syntax, but you should understand what it means comp_t Comp = (comp_t) mxGetData(plhs[0]); ...

We are finally ready to create a Cp object and make it run:

 /**  cut-pursuit with preconditioned forward-Douglas-Rachford  **/

 Cp_d1_ql1b<real_t, index_t, comp_t> *cp =
     new Cp_d1_ql1b<real_t, index_t, comp_t>(V, E, first_edge,

adj_vertices);

 cp->set_edge_weights(edge_weights, homo_edge_weight);
 cp->set_quadratic(Y, N, A, a);
 cp->set_l1(l1_weights, homo_l1_weight, Yl1);
 cp->set_bounds(low_bnd, homo_low_bnd, upp_bnd, homo_upp_bnd);
 cp->set_cp_param(cp_dif_tol, cp_it_max, verbose);
 cp->set_pfdr_param(pfdr_rho, pfdr_cond_min, pfdr_dif_rcd, pfdr_it_max,
     pfdr_dif_tol);

 cp->set_monitoring_arrays(<if you created monitoring arrays>);

 cp->set_components(0, Comp); // use the preallocated component

array Comp

 *it = cp->cut_pursuit();

Finally, you might want to extract the components values because they have been allocated by the Cut-pursuit routine, and should be destroyed with the Cp object :

 /* copy reduced values */
 comp_t rV = cp->get_components(); // number of components
 real_t *cp_rX = cp->get_reduced_values(); // pointer to the values

 <here you can copy what's in cp_rX into an other array, let's say rX>

 cp->set_components(0, nullptr); // prevent the components set above

to be free()'d delete cp;

Then you are set to send the arrays Comp and rX (containing components assignement and components values) back to Python. Your final iterate is X[v] = rX[Comp[v]]

Hope this helps Hugo Raguet

On 11/01/2019 02:47, silent567 wrote:

Hi, Hugo,

Thanks for replying! It's good to know the repo is compatible with C++11.

For the C++ example, a simple main file that reads in arrays and arguments (any random array is OK), and then prints out the smoothed array, will be perfect for me, (i.e. I am looking for a runnable C++ interface). I have gone through the *mex.c file for Octave, but it seems to be too complicated manipulating almost every details and lots of MEX specific words for an easy start.

For my familiarity of C++, I know the basic knowledge about C++ and have improved on several pure C++ projects before. I think it's enough for developing some interface for python. But I used to develop on projects with compiling commands and a simple main file to start with. Therefore, I have little knowledge about compiling, linking, and related commands, especially for those big projects with a lot of libraries utilized. Would you kindly provide a simple runnable main file with compiling commands? (for me, interface to solve total variation over graph will be enough) It will be appreciated by other C++/C users I think.

For Octave, I have utilized it for several small projects in school, though I don't know the interaction between Octave and C++. It is easy to use but not powerful enough for scientific computing I would say. There are multiple packages I need that are only available in python. Also, I have already implemented the framework in python and trying to improve it by introducing more features. It would be painful and almost impossible for me to transfer it to Octave.

Thank you again for such kind and quick reply! Any kinds of suggestions and advice are welcome. Looking forward to the python interface so that more people will benefit from your work!

Tang Hao

silent567 commented 5 years ago

Thank you very much! I will try to understand it these days.

Best, Tang Hao

silent567 commented 5 years ago

Hi, @1a7r0ch3,

I have started to implement the C++ interface. Currently, I am just learning the basis for compiling and linking C++ files. One error is encountered while compiling "pfdr_d1_ql1b.cpp": ambiguous call of abs.

One command to reproduce the error:

g++ -std=c++11 -g -O2 -fopenmp ./src/pfdr_d1_ql1b.cpp -o ./build/tmp/pfdr_d1_ql1b.o

Simple version of error code:

./src/pfdr_d1_ql1b.cpp:184:29: error: call of overloaded 'abs(float)' is ambiguous
             real_t amp = abs(X[v] - Yl1_(v));
                          ~~~^~~~~~~~~~~~~~~~
In file included from /usr/include/c++/6/cstdlib:75:0,
                 from ./src/../include/pcd_prox_split.hpp:13,
                 from ./src/../include/pcd_fwd_doug_rach.hpp:24,
                 from ./src/../include/pfdr_graph_d1.hpp:27,
                 from ./src/../include/pfdr_d1_ql1b.hpp:35,
                 from ./src/pfdr_d1_ql1b.cpp:4:
/usr/include/stdlib.h:774:12: note: candidate: int abs(int)
 extern int abs (int __x) __THROW __attribute__ ((__const__)) __wur;
            ^~~
In file included from ./src/../include/pcd_prox_split.hpp:13:0,
                 from ./src/../include/pcd_fwd_doug_rach.hpp:24,
                 from ./src/../include/pfdr_graph_d1.hpp:27,
                 from ./src/../include/pfdr_d1_ql1b.hpp:35,
                 from ./src/pfdr_d1_ql1b.cpp:4:
/usr/include/c++/6/cstdlib:180:3: note: candidate: long long int std::abs(long long int)
   abs(long long __x) { return __builtin_llabs (__x); }
   ^~~
/usr/include/c++/6/cstdlib:172:3: note: candidate: long int std::abs(long int)
   abs(long __i) { return __builtin_labs(__i); }

The enviroment:

>> uname -a
Linux 94ef7fbe0ef7 4.4.0-92-generic #115-Ubuntu SMP Thu Aug 10 09:04:33 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

>> g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 6.4.0-17ubuntu1~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enab
le-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --with-as=/usr/bin/x86_64-linux-gnu-as --with-ld=/usr/bin/x86_64-linux-gnu-ld
--program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext -
-enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --
with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-bro
wser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-d
ir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/
usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=
m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux
-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1~16.04)

The makefile:

CC=g++
#CFLAGS=-Wl,--no-as-needed -std=c++11 -g -O2 -fopenmp
CFLAGS=-std=c++11 -g -O2 -fopenmp
LIBS=-lpthread
SRC=$(PWD)/src
INCLUDE=$(PWD)/include
BUILDDIR=$(PWD)/build
OUTDIR=$(BUILDDIR)/bin
TEMPDIR=$(BUILDDIR)/tmp
MKDIR_P = mkdir -p                                                                                                                       [34/1168]

all: directories $(OUTDIR)/test

directories: ${OUTDIR} $(TEMPDIR)
$(OUTDIR):
            ${MKDIR_P} $(OUTDIR)
$(TEMPDIR):
            ${MKDIR_P} $(TEMPDIR)

$(OUTDIR)/test: $(TEMPDIR)/cp_pfdr_d1_ql1b.o $(TEMPDIR)/matrix_tools.o $(TEMPDIR)/cut_pursuit_d1.o $(TEMPDIR)/cut_pursuit.o $(TEMPDIR)/cp_graph.o
$(TEMPDIR)/cp_maxflow.o $(TEMPDIR)/pfdr_d1_ql1b.o $(TEMPDIR)/pfdr_graph_d1.o $(TEMPDIR)/pcd_fwd_doug_rach.o $(TEMPDIR)/pcd_prox_split.o
        #$(CC) $(CFLAGS) $(TEMPDIR)/cp_graph.o $(LIBS) -o $(OUTDIR)/test

$(TEMPDIR)/cp_pfdr_d1_ql1b.o: $(SRC)/cp_pfdr_d1_ql1b.cpp $(INCLUDE)/cp_pfdr_d1_ql1b.hpp $(INCLUDE)/omp_num_threads.hpp $(INCLUDE)/matrix_tools.hpp
 $(INCLUDE)/wth_element.hpp $(SRC)/wth_element_generic.cpp $(INCLUDE)/cut_pursuit_d1.hpp $(INCLUDE)/cut_pursuit.hpp $(INCLUDE)/cp_graph.hpp $(INCL
UDE)/block.hpp
        $(CC) -c $(CFLAGS) $(SRC)/cp_pfdr_d1_ql1b.cpp -o $(TEMPDIR)/cp_pfdr_d1_ql1b.o

$(TEMPDIR)/matrix_tools.o: $(SRC)/matrix_tools.cpp $(INCLUDE)/matrix_tools.hpp $(INCLUDE)/omp_num_threads.hpp
        $(CC) -c $(CFLAGS) $(SRC)/matrix_tools.cpp -o $(TEMPDIR)/matrix_tools.o

$(TEMPDIR)/cut_pursuit_d1.o: $(SRC)/cut_pursuit_d1.cpp $(INCLUDE)/cut_pursuit_d1.hpp $(INCLUDE)/omp_num_threads.hpp
        $(CC) -c $(CFLAGS) $(SRC)/cut_pursuit_d1.cpp -o $(TEMPDIR)/cut_pursuit_d1.o

$(TEMPDIR)/cut_pursuit.o: $(SRC)/cut_pursuit.cpp $(INCLUDE)/cut_pursuit.hpp $(INCLUDE)/omp_num_threads.hpp
        $(CC) -c $(CFLAGS) $(SRC)/cut_pursuit.cpp -o $(TEMPDIR)/cut_pursuit.o

$(TEMPDIR)/cp_graph.o: $(SRC)/cp_graph.cpp $(INCLUDE)/cp_graph.hpp $(INCLUDE)/block.hpp
        $(CC) -c $(CFLAGS) $(SRC)/cp_graph.cpp -o $(TEMPDIR)/cp_graph.o

$(TEMPDIR)/cp_maxflow.o: $(SRC)/cp_maxflow.cpp $(INCLUDE)/cp_graph.hpp $(INCLUDE)/block.hpp
        $(CC) -c $(CFLAGS) $(SRC)/cp_maxflow.cpp -o $(TEMPDIR)/cp_maxflow.o

$(TEMPDIR)/pfdr_d1_ql1b.o: $(SRC)/pfdr_d1_ql1b.cpp $(INCLUDE)/pfdr_d1_ql1b.hpp $(INCLUDE)/matrix_tools.hpp  $(INCLUDE)/omp_num_threads.hpp $(INCLU
DE)/pfdr_graph_d1.hpp $(INCLUDE)/pcd_fwd_doug_rach.hpp $(INCLUDE)/pcd_prox_split.hpp
        $(CC) -c $(CFLAGS) $(SRC)/pfdr_d1_ql1b.cpp -o $(TEMPDIR)/pfdr_d1_ql1b.o

$(TEMPDIR)/pfdr_graph_d1.o: $(SRC)/pfdr_graph_d1.cpp $(INCLUDE)/omp_num_threads.hpp $(INCLUDE)/pfdr_graph_d1.hpp $(INCLUDE)/pcd_fwd_doug_rach.hpp
$(INCLUDE)/pcd_prox_split.hpp
        $(CC) -c $(CFLAGS) $(SRC)/pfdr_graph_d1.cpp -o $(TEMPDIR)/pfdr_graph_d1.o

$(TEMPDIR)/pcd_fwd_doug_rach.o: $(SRC)/pcd_fwd_doug_rach.cpp $(INCLUDE)/omp_num_threads.hpp $(INCLUDE)/pcd_fwd_doug_rach.hpp $(INCLUDE)/pcd_prox_s
plit.hpp
        $(CC) -c $(CFLAGS) $(SRC)/pcd_fwd_doug_rach.cpp -o $(TEMPDIR)/pcd_fwd_doug_rach.o

$(TEMPDIR)/pcd_prox_split.o: $(SRC)/pcd_prox_split.cpp $(INCLUDE)/pcd_prox_split.hpp $(INCLUDE)/omp_num_threads.hpp
        $(CC) -c $(CFLAGS) $(SRC)/pcd_prox_split.cpp -o $(TEMPDIR)/pcd_prox_split.o

clean:
            rm -rf $(BUILDDIR)

Do you have any clue about solving this error? I have googled a little and am suspecting the implicit type conversion as the reason.

It will be appreciated if you could answer the question, although I will continue on this project tomorrow anyway. I am considering wrapping the operand with explicit type conversion like double/float. But I am afraid of changing the codes' semantic meaning.

Thank you very much! Tang Hao

1a7r0ch3 commented 5 years ago

Hello Tang Hao,

You actually caught a bug! Thank you. pfdr_d1_ql1b.cpp missed an #include <cmath>, overloading std::abs for floating point numeric values (cstdlib defines it but only for integers).

I guess I had not seen it before because I compiled it in an environment overloading it somehow by default (mex)

It is now fixed.

Keep in touch HR

On 28/01/2019 07:42, silent567 wrote:

Hi, @1a7r0ch3,

I have started to implement the C++ interface. Currently, I am just learning the basis for compiling and linking C++ files. One error is encountered while compiling "pfdr_d1_ql1b.cpp": ambiguous call of abs.

...

Do you have any clue about solving this error? I have googled a little and am suspecting the type conversion as the reason.

It will be appreciated if you could answer the question, although I will continue on this project tomorrow anyway. I am considering wrapping the operand with explicit type conversion like double/float. But I am afraid of changing the codes' semantic meaning.

Thank you very much! Tang Hao

silent567 commented 5 years ago

Hi Hugo Raguet,

Thank you for helping fix it. I will continue to implement the C++ interface.

Thank you! Tang Hao

silent567 commented 5 years ago

Hi, Hugo,

Sorry for bothering you again. I am currently implementing the C++ interface together with a simple example. Before starting to write codes, there are a few uncertain points I want to confirm:

About the index style, I noticed that there were two index styles (C++, Matlab) mentioned in your comments. Is there any common rule for such index? For example, for the graph representation, the first_edge or the adj_vertices; also for the output, the rx and the Comp.
About the format of array such as *edge_weights, are their contents stored in classical continuous memory, or some modern STL style interators?
For different graphs, should I construct different solvers, or reset the values in one solver?
There is a copying procedure in the end for rx. Then, should there also be another copying procedure for Comp?

Another question is about the theoretical part. I am writing some report about this direction. And I need to compare the existing methods theoretically, as no experiencial comparision available now. Could you kindly provide some theoretical results such as the complexity of your method on my specific problem? It doesn't need to be accurate. Some simple intuitive result will be appreciated. (Or some simple experiencial result is also welcome. For example, on computers with i7 CPU, another python implementation takes ~0.15s for random graphs with ~900 nodes. Is there any similar results for parallel-cut-pursuit?)

Thank you very much! Tang Hao

silent567 commented 5 years ago

Hi, Hugo,

I have wrote the C++ interface example according to my own understanding of the comments. Is it possible for you check the correctness (I am not sure about the index style specifically)?

Thank you very much! I will move on to the implementation of python interface.

Best, Tang Hao

1a7r0ch3 commented 5 years ago

Hi Tang Hao

Seems about right. There remains some parts that are odd, or even wrong

  /**  prepare output; rX is created later  **/

1 if (Comp != nullptr) delete[] Comp; 2 Comp = new comp_t[V]; 3 if (it != nullptr) delete it; 4 it = new int;

Line 1 : how can you be sure that the pointer 'Comp' is a valid pointer to delete? Line 3,4 : you treat 'it' (meant to count the number of performed CP iterations) as an array of size 1. I know this is how it is done in mex, but in mex everything must be an mxArray, even scalars. In your code it would be more concise to treat 'it' as an int.

 /* copy reduced values */
 ...

Strictly speaking, it should not be necessary to copy the reduced values. The copy is used only because when you delete de Cp_* object, the rX is also deleted. Thus, you could simply reconstruct X from rX before deleting Cp_*. I understand that this slightly changes you function.

 index_t E = 4*size*(size-1);

If you are using 4-nearest-neighbors connectivity on a 2D grid, the number of edge should be 2size(size-1) (once for horizontal, once for vertical).

 /* build the grid/mesh graph */
 ...

I haven't checked throughly your loops, but it seems to do the job. However, you should replace the hard-coded 32 by the variable size in case you want to change it later.

Finally, I am not sure that a complicated makefile is necessary. If this works for you that's fine, but have you tried only a simple line like the following?

g++ -std=c++11 -g -O2 -fopenmp -lpthreads \ src/test.cpp \ src/cp_pfdr_d1_ql1b.cpp src/cut_pursuit_d1.cpp src/cut_pursuit.cpp src/cp_graph.cpp src/cp_maxflow.cpp \ src/pfdr_d1_ql1b.cpp src/matrix_tools.cpp src/pfdr_graph_d1.cpp src/pcd_fwd_doug_rach.cpp src/pcd_prox_split.cpp \ -o test

Keep in touch Hugo

On 31/01/2019 08:38, silent567 wrote:

Hi, Hugo,

I have wrote the C++ interface example according to my own understanding of the comments. Is it possible for you check the correctness (I am not sure about the index style specifically)?

Thank you very much! I will move on to the implementation of python interface.

Best, Tang Hao

1a7r0ch3 commented 5 years ago

Hello,

Just to say that we finally implemented an interface for Python, as generic as possible (cross-platform, no third-party library involved), simply using distutils.

Check the setup.py for extension module compilation (in UNIX systems, python setup.py built_ext in the correct directory should be enough ; be sure to have OpenMP enabled to your compiler to enjoy parallelism), and check the scripts for usage.

For the proximity operator of the graph total variation (a.k.a. graph total variation denoising, a.k.a. generalized fused lasso), something like the following would be a good starting point.

# ensure it is in your path
from cp_pfdr_d1_ql1b import cp_pfdr_d1_ql1b

# load your data, construct the graph structure
...

# call the solver
Comp, rX, it = cp_pfdr_d1_ql1b(y, 1.0, first_edge, adj_vertices,                            
                                                  edge_weights)

You can later tweak the parameters when you get more familiar with the documentation. Tell me if anything goes wrong.

Regards, H. Raguet

On 29/01/2019 03:13, silent567 wrote:

Hi Hugo Raguet,

Thank you for helping fix it. I will continue to implement the C++ interface.

Thank you! Tang Hao

silent567 commented 5 years ago

This is so nice of you! I was distracted by other issues last month, but I do need the python interface. Thank you very much for your efforts!

Hello, Just to say that we finally implemented an interface for Python, as generic as possible (cross-platform, no third-party library involved), simply using distutils. Check the setup.py for extension module compilation (in UNIX systems, python setup.py built_ext in the correct directory should be enough ; be sure to have OpenMP enabled to your compiler to enjoy parallelism), and check the scripts for usage. For the proximity operator of the graph total variation (a.k.a. graph total variation denoising, a.k.a. generalized fused lasso), something like the following would be a good starting point. # ensure it is in your path from cp_pfdr_d1_ql1b import cp_pfdr_d1_ql1b # load your data, construct the graph structure) ... # call the solver Comp, rX, it = cp_pfdr_d1_ql1b(y, None, first_edge, adj_vertices, edge_weights) You can later tweak the parameters when you get more familiar with the documentation. Tell me if anything goes wrong. Regards, H. Raguet … On 29/01/2019 03:13, silent567 wrote: Hi Hugo Raguet, Thank you for helping fix it. I will continue to implement the C++ interface. Thank you! Tang Hao

silent567 commented 5 years ago

Hi Raguet,

Thank you for kindly providng the interface. However, I have some troubles while using it. Could you help me?

Firstly, it seems that there is a typo in line 142 of file "cp_pfdr_d1_ql1b.py". Should it be "if type(Y) == np.ndarray and Y.any():" instead of "if type(Yl1) == np.ndarray and Y.any():" ?
The second trouble is very serious since I have little clue about the reasons. The only error message is "Segmentation fault (core dumped)" .Here is the test file:
```
#!/usr/bin/env python
# coding=utf-8
```

import sys import os import numpy as np os.chdir(os.path.realpath(os.path.dirname(file))) sys.path.append(os.path.join(os.path.realpath(os.path.dirname(file)), "bin")) sys.path.append(os.path.join(os.path.realpath(os.path.dirname(file)), "wrappers")) from cp_pfdr_d1_ql1b import cp_pfdr_d1_ql1b

NODE_NUM = 100 y = np.random.rand(NODE_NUM) A = np.triu(np.random.rand(NODE_NUM,NODE_NUM)*(1-np.eye(NODE_NUM))>=0.5) first_edge = np.concatenate([[0],np.cumsum(np.sum(A,axis=1))]).astype('uint32') adj_vertices = np.mask_indices(NODE_NUM,lambda n,k:A)[1].astype('uint32') edge_weights = np.ones_like(adj_vertices).astype(y.dtype) print('y:',y.shape,y.dtype,y) print('first_edge:',first_edge.shape,first_edge.dtype,first_edge) print('adj_vertices:',adj_vertices.shape,adj_vertices.dtype,adj_vertices) print('edge_weights:',edge_weights.shape,edge_weights.dtype,edge_weights)

Comp, rX, it = cp_pfdr_d1_ql1b(y, 1.0, first_edge, adj_vertices,edge_weights)


And here is one example output:

y: (100,) float64 [0.25848894 0.12042543 0.05788337 0.82835872 0.44925049 0.9333543 0.83992794 0.19517053 0.53395183 0.52304919 0.51797094 0.46156496 0.51090695 0.21215241 0.64626502 0.228764 0.6503674 0.06844041 0.2085953 0.75263621 0.39363104 0.67981041 0.40222625 0.33969821 0.29487116 0.6457116 0.86523458 0.24098299 0.79274991 0.82871169 0.17892907 0.01377916 0.99571288 0.71689459 0.43063008 0.46189395 0.14337708 0.25335607 0.27790843 0.85473443 0.06529582 0.37795304 0.42685513 0.25131089 0.43993575 0.05881827 0.67940155 0.73650094 0.68912097 0.34944059 0.04261646 0.07986297 0.89028503 0.45797371 0.11907341 0.71627078 0.55076837 0.59094998 0.92059299 0.36355924 0.39272491 0.87771681 0.94286495 0.36240365 0.54308758 0.49095812 0.59515393 0.92128214 0.01691616 0.29420435 0.76324092 0.50357037 0.39946537 0.33269947 0.67248554 0.51078207 0.27620362 0.51448974 0.66333216 0.84835539 0.72903093 0.14847662 0.3475205 0.32871951 0.61087944 0.00430736 0.97058788 0.39110979 0.5508503 0.14473484 0.09637599 0.26040638 0.03322152 0.97284433 0.06102188 0.67320791 0.61087944 0.00430736 0.97058788 0.39110979 0.5508503 0.14473484 0.09637599 0.26040638 0.03322152 0.97284433 0.06102188 0.67320791 0.57215904 0.20447155 0.64672924 0.30935764] first_edge: (101,) uint32 [ 0 52 105 152 202 247 287 326 374 424 472 514 553 596 636 674 717 758 799 843 880 922 954 985 1028 1054 1088 1120 1156 1184 1219 1257 1294 1330 1368 1397 1435 1462 1492 1520 1550 1581 1609 1634 1662 1689 1718 1744 1773 1798 1822 1845 1864 1888 1909 1928 1953 1974 1997 2013 2034 2053 2077 2091 2107 2127 2148 2166 2177 2194 2209 2227 2238 2249 2266 2281 2291 2299 2312 2320 2332 2343 2351 2358 2366 2373 2378 2385 2391 2399 2403 2407 2410 2413 2416 2417 2421 2423 2424 2425 2425] adj_vertices: (2425,) uint32 [ 2 3 4 ... 99 98 99] edge_weights: (2425,) float64 [1. 1. 1. ... 1. 1. 1.] <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <c lass 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'float'> <class 'int'> <class 'float'> <class 'float'> <class 'float'

<class 'float'> <class 'int'> <class 'int'> <class 'bool'> <class 'bool'> <class 'bool'> <class 'bool'> <class 'bool'> Cut-pursuit initialization: Segmentation fault (core dumped)

1a7r0ch3 commented 5 years ago

Hi,

many thanks for the feedback, really helps debugging.

On 16/04/2019 10:48, silent567 wrote:

Thank you for kindly providng the interface. However, I have some troubles while using it. Could you help me?

Firstly, it seems that there is a typo in line 142 of file "cp_pfdr_d1_ql1b.py". Should it be "if type(Y) == np.ndarray and Y.any():" instead of "if type(Yl1) == np.ndarray and Y.any():" ? Good catch.

The second trouble is very serious since I have little clue about the reasons. The only error message is "Segmentation fault (core dumped)" I found the problem and pushed the fix. Please update and try again.

y = np.random.rand(NODE_NUM) ... edge_weights = np.ones_like(adj_vertices).astype(y.dtype) By the way, on such an example where all observations lie between 0 and 1 and edge weights equal to 1, the solution is uniformly 0 (edit: 0.5), so the function will return after only one iteration! Also, note that if the edge weights are uniformly equal, the argument edge_weight can be a scalar, and if it is 1, it is even optional.

Keep in touch HR

silent567 commented 5 years ago

Hi Raguet,

Thank you very much for the response and support. I will test it tomorrow. One question is about the optimal solution. For random cases, I am expecting solutions around 0.5 instead of 0. Is "0" a typo? For non-random cases, it will depend on the graph structure and value distribution. Am I correct about these?

Thank you again! Tang, Hao

1a7r0ch3 commented 5 years ago

Hi,

Yes, I meant all around 0.5!

Expect a couple additional troubles before everything works great... actually, I originally had written this program for much more complex cases than TV denoising, and I only tested extensively complicated examples, but I had not tested the prox TV version. It seems there is a couple more bugs, hopefully I find them all before tomorrow!

Regards HR

On 16/04/2019 16:01, silent567 wrote:

Hi Raguet,

Thank you very much for the response and support. I will test it tomorrow. One question is about the optimal solution. For random cases, I am expecting solutions around 0.5 instead of 0. Is "0" a typo? For non-random cases, it will depend on the graph structure and value distribution. Am I correct about these?

Thank you again! Tang, Hao

silent567 commented 5 years ago

Hi Raguet,

Thank you very much for the active support. The work looks great while complicated. Thank you for helping debugging!

Best, Tang, Hao