gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License

Test running compiled SINGE on macOS #54

Closed: agitter closed 4 years ago

agitter commented 4 years ago

This pull request contains initial work to install and run SINGE on macOS. It uses GitHub Actions to avoid cluttering the Travis CI configuration further.

The main problem is that the compiled MATLAB executables are not cross-platform. I added a note to the readme describing this (edd02a6). Running the Linux executables on macOS gives the error:
./run_SINGE_GLG_Test.sh: line 30: ./SINGE_GLG_Test: cannot execute binary file

The solution would be to compile another version of the executables on macOS and host them alongside the Linux executables. In the meantime, macOS (and Windows) users can run SINGE through MATLAB if they have a MATLAB license or through Docker.

agitter commented 4 years ago

As of d49bfe6, the basic macOS functionality is working. This simplified test runs the compiled SINGE_Test with the MATLAB R2020a runtime. The next steps will be compiling and testing SINGE_GLG_Test and SINGE_Aggregate and updating the SINGE.sh wrapper script to work for Linux or macOS.

agitter commented 4 years ago

I updated the SINGE.sh script to detect the operating system. It checks for macOS and assumes Linux otherwise. When macOS is detected, it adds the _mac suffix to the wrapper scripts for the compiled MATLAB code so that the OS-specific script is used.
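For readers who want to see the selection logic spelled out, here is a minimal Python sketch of the same idea. It is not the code in SINGE.sh, which is a shell script. The function name wrapper_script_name is hypothetical, and the exact placement of the _mac suffix in the file name is an assumption based on the run_SINGE_GLG_Test.sh name seen in the error message above.

```python
import platform


def wrapper_script_name(base_name, system=None):
    """Pick the wrapper script for a compiled executable (hypothetical helper).

    Mirrors the rule described above: check for macOS, assume Linux otherwise.
    """
    if system is None:
        # platform.system() returns "Darwin" on macOS and "Linux" on Linux.
        system = platform.system()
    # Assumption: the suffix goes right before the .sh extension.
    suffix = "_mac" if system == "Darwin" else ""
    return f"run_{base_name}{suffix}.sh"


if __name__ == "__main__":
    print(wrapper_script_name("SINGE_GLG_Test"))
```

Because anything that is not macOS falls through to the Linux scripts, Windows users would still need MATLAB or Docker, as noted earlier in this thread.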

This works in our Linux Travis CI test and the macOS GitHub Actions test. You can see in the script output that the correct OS is recognized.

I'll add a little more documentation, and then this will be ready for review.

agitter commented 4 years ago

Currently this test does not compare the edge weights or gene scores against the reference data. It prints the gene scores in the build log for manual inspection.

I can add the test cases before merging to assess any OS-specific behavior. We should know about any such differences so that we can alert users.

agitter commented 4 years ago

@atuldeshpande the macOS workflow now runs the script that compares the newly generated output with the stored reference output. I also described which tests are run in the readme.

The OS-specific differences are minor:

Comparing SINGE_Gene_Influence.txt
files are identical
Comparing SINGE_Ranked_Edge_List.txt
files are identical
Comparing sparse adjacency matrices
Spare matrices in Output/AdjMatrix_data1_X_SCODE_datapmat_ID_541_lambda_0p01_replicate_1.mat and tests/reference/latest/AdjMatrix_data1_X_SCODE_datapmat_ID_541_lambda_0p01_replicate_1.mat have different values
Maximum absolute difference: 0.0001509642757788754
Spare matrices in Output/AdjMatrix_data1_X_SCODE_datapmat_ID_541_lambda_0p01_replicate_2.mat and tests/reference/latest/AdjMatrix_data1_X_SCODE_datapmat_ID_541_lambda_0p01_replicate_2.mat have different values
Maximum absolute difference: 0.005782771667936615
Spare matrices in Output/AdjMatrix_data1_X_SCODE_datapmat_ID_542_lambda_0p01_replicate_1.mat and tests/reference/latest/AdjMatrix_data1_X_SCODE_datapmat_ID_542_lambda_0p01_replicate_1.mat have different values
Maximum absolute difference: 0.00011594663658021087
Spare matrices in Output/AdjMatrix_data1_X_SCODE_datapmat_ID_542_lambda_0p01_replicate_2.mat and tests/reference/latest/AdjMatrix_data1_X_SCODE_datapmat_ID_542_lambda_0p01_replicate_2.mat have different values
Maximum absolute difference: 0.00014391767285104606

The gene and edge scores match to 5 significant figures. However, some values in the individual adjacency matrices fail the numpy.allclose test, which applies both a relative and an absolute tolerance.
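To make the tolerance behavior concrete, here is a small sketch of how numpy.allclose treats differences of the size reported above. The arrays are made-up dense values chosen to mimic the largest reported difference (about 5.8e-3), not data from the actual sparse matrices, and the relaxed atol=1e-2 is only an illustrative choice.

```python
import numpy as np

# Illustrative values only; not taken from the SINGE adjacency matrices.
reference = np.array([0.25, 0.5, 1.0])
generated = reference + np.array([0.0, 1.5e-4, 5.8e-3])

max_abs_diff = np.max(np.abs(generated - reference))

# numpy.allclose checks |a - b| <= atol + rtol * |b| element-wise,
# with defaults rtol=1e-05 and atol=1e-08.
strict = np.allclose(generated, reference)

# A looser absolute tolerance (hypothetical value) accepts the same data.
relaxed = np.allclose(generated, reference, atol=1e-2)

print(f"max abs diff: {max_abs_diff:.4g}")
print(f"passes with default tolerances: {strict}")
print(f"passes with atol=1e-2: {relaxed}")
```

With the defaults, the allowed difference for a value near 1 is roughly 1e-5, so differences on the order of 1e-4 to 1e-3 fail even though the aggregated scores agree.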

Should we relax the test case for both platforms? Or relax it only for the macOS test and keep the more stringent check on Linux?

agitter commented 4 years ago

I generalized the test script so that it can take an optional tolerance argument. The macOS test now uses a more permissive threshold, and the Linux test uses the stricter defaults. All tests are passing.
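As a rough illustration of what an optional tolerance argument can look like, here is a sketch using argparse. The flag names --rtol and --atol, the helper names, and the value 1e-2 are all hypothetical; the actual test script in the repository may expose the tolerance differently. The defaults shown are the numpy.allclose defaults.

```python
import argparse


def parse_args(argv=None):
    """Parse optional tolerances (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Compare generated output with reference output."
    )
    # Defaults match numpy.allclose: rtol=1e-05, atol=1e-08.
    parser.add_argument("--rtol", type=float, default=1e-5)
    parser.add_argument("--atol", type=float, default=1e-8)
    return parser.parse_args(argv)


def within_tolerance(new, ref, rtol, atol):
    """Scalar form of the criterion that numpy.allclose applies element-wise."""
    return abs(new - ref) <= atol + rtol * abs(ref)


# Linux-style call: no arguments, so the strict defaults apply.
linux_args = parse_args([])
# macOS-style call: a more permissive absolute tolerance (illustrative value).
mac_args = parse_args(["--atol", "1e-2"])

print(within_tolerance(1.0058, 1.0, linux_args.rtol, linux_args.atol))
print(within_tolerance(1.0058, 1.0, mac_args.rtol, mac_args.atol))
```

Keeping the strict values as defaults means the Linux workflow needs no changes, and only the macOS workflow has to pass an extra argument.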