gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License

Test running SCINGE inside MATLAB Docker container #16

agitter closed 5 years ago

agitter commented 5 years ago

Closes #14

Use https://hub.docker.com/r/fbenz/docker-java-matlab as the base Docker container or an example of how to create a container with the MATLAB runtime.

agitter commented 5 years ago

https://hub.docker.com/r/amarburg/matlab-runtime is a similar container that has MATLAB R2018a (9.4), which matches the version used to compile SCINGE. It also attempts to fix libstdc++ problems, which were encountered in 678337e.

agitter commented 5 years ago

The compiled SCINGE_Example now runs successfully in the MATLAB Docker container. Before merging, we'll need to confirm that it produces the correct output. Currently, it only runs GLG_Instance one time. @atuldeshpande is it safe to remove https://github.com/gitter-lab/SCINGE/blob/2609ecf9a24c07c825144325c4c892cdeda27dc4/code/GLG_Instance.m#L89-L91 ?

We can discuss how to create a SCINGE interface that supports command line arguments or a config file.

We'll also need to decide where to permanently host the compiled SCINGE code. Currently it is on a temporary biostat.wisc.edu directory for testing.

agitter commented 5 years ago

e17ac05 now runs GLG the correct number of times and produces the expected output files:

-rw-r--r-- 1 root root  7947 Apr 19 21:52 AdjMatrix_data1_X_SCODE_datapmat_ID_541_replicate_1.mat
-rw-r--r-- 1 root root  7905 Apr 19 21:56 AdjMatrix_data1_X_SCODE_datapmat_ID_541_replicate_2.mat
-rw-r--r-- 1 root root  5719 Apr 19 21:55 AdjMatrix_data1_X_SCODE_datapmat_ID_542_replicate_1.mat
-rw-r--r-- 1 root root  5689 Apr 19 21:59 AdjMatrix_data1_X_SCODE_datapmat_ID_542_replicate_2.mat
-rw-r--r-- 1 root root  1246 Apr 19 21:59 SCINGE_Gene_Influence.txt
-rw-r--r-- 1 root root 26389 Apr 19 21:59 SCINGE_Ranked_Edge_List.txt

agitter commented 5 years ago

I ran SCINGE_Example.m in MATLAB R2018a to generate reference output files. The file sizes are small so I stored them all as uncompressed files. The new script compare_example_output.sh tests that the output from the compiled SCINGE_Example run inside the Docker container matches the output files. It uses the csvdiff Python package to compare SCINGE_Gene_Influence.txt and SCINGE_Ranked_Edge_List.txt. This package can ignore differences past a desired number of significant figures.
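To make the idea of ignoring differences past a number of significant figures concrete, here is a minimal standard-library sketch. It is not the csvdiff package's API and not the contents of compare_example_output.sh. The tab delimiter, the column names, and the function names `round_sig`, `cells_match`, and `find_differences` are assumptions made for illustration.

```python
import csv
import io
import math


def round_sig(value, sig_figs):
    """Round a float to the given number of significant figures."""
    if value == 0 or not math.isfinite(value):
        return value
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - exponent)


def cells_match(a, b, sig_figs):
    """Compare two cells numerically when both parse as floats, else as text."""
    try:
        return round_sig(float(a), sig_figs) == round_sig(float(b), sig_figs)
    except ValueError:
        return a == b


def find_differences(expected, actual, sig_figs=6, delimiter="\t"):
    """Return (row, column) positions where two delimited texts differ."""
    rows_e = list(csv.reader(io.StringIO(expected), delimiter=delimiter))
    rows_a = list(csv.reader(io.StringIO(actual), delimiter=delimiter))
    if len(rows_e) != len(rows_a):
        raise ValueError("row counts differ")
    diffs = []
    for r, (row_e, row_a) in enumerate(zip(rows_e, rows_a)):
        if len(row_e) != len(row_a):
            raise ValueError(f"column counts differ in row {r}")
        for c, (a, b) in enumerate(zip(row_e, row_a)):
            if not cells_match(a, b, sig_figs):
                diffs.append((r, c))
    return diffs


# Hypothetical file contents; the real output format may differ.
reference = "Gene\tInfluence\nG1\t0.123456789\nG2\t2.5\n"
container = "Gene\tInfluence\nG1\t0.123456712\nG2\t2.6\n"
print(find_differences(reference, container, sig_figs=6))
```

Rounding both values before comparing is the simplest way to express a significant-figure tolerance, but two values that straddle a rounding boundary can still be reported as different even when they are very close. A relative tolerance check such as `math.isclose` avoids that edge case.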

In order to run the tests in Python, Miniconda is now installed inside the Docker container approximately following the Miniconda Dockerfile.

Currently, the intermediate .mat files are not tested. They could be if we use Python to import these files or convert them to .csv and compare them with csvdiff. That would require additional Python packages, in which case we should create a conda environment for the testing. What would we want to compare? Only the values in the sparse adjacency matrix?

Running the tests in Python and bash scripts is unusual, but if we write them in MATLAB we would need to compile them to run them in Travis CI. These test scripts can eventually be used to

agitter commented 5 years ago

The sparse adjacency matrix test now outputs the maximum absolute difference if the values are not close. The test cases now run correctly and run inside a conda environment, which will be important for our local testing.
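The check described above could look roughly like the following sketch. It is not the repository's test code. To stay dependency-free, it assumes each sparse matrix has already been loaded into a dict mapping `(row, column)` to a value; the real test presumably reads the .mat files with additional Python packages. The function names and the `atol=1e-8` tolerance are hypothetical.

```python
def max_abs_difference(expected, actual):
    """Largest |expected - actual| over all entries; missing entries count as 0."""
    keys = set(expected) | set(actual)
    return max(
        (abs(expected.get(k, 0.0) - actual.get(k, 0.0)) for k in keys),
        default=0.0,
    )


def check_sparse_close(expected, actual, atol=1e-8):
    """Return (passed, message); on failure the message reports the max difference."""
    diff = max_abs_difference(expected, actual)
    if diff <= atol:
        return True, "matrices are close"
    return False, (
        f"matrices differ: max absolute difference {diff:.3e} exceeds {atol:.1e}"
    )


# Hypothetical reference matrix with two nonzero entries.
reference = {(0, 1): 0.5, (2, 0): 1.25}

# A tiny perturbation should pass, a large one should fail.
tiny = dict(reference)
tiny[(0, 1)] += 1e-12
large = dict(reference)
large[(0, 1)] += 1e-3

print(check_sparse_close(reference, tiny))
print(check_sparse_close(reference, large))
```

Perturbing a copy of the reference matrix, as in the last lines, is one way to confirm that such a test fails only when the difference is large enough.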

Before the final code review, I plan to further test my test code by modifying some of the sparse matrices and making sure the tests fail.

agitter commented 5 years ago

@atuldeshpande this is ready for review. I will merge after you approve.

I tested that the test cases work, failing in Travis CI when they should:

I've also done some local testing to confirm that the sparse adjacency matrix test only fails when the absolute difference is large enough. It passes for very small differences.