Open jonas-hag opened 2 years ago
Thanks for your interest in SUMO.
The objective function used by SUMO for factorization is:
SUMO aims to decompose the adjacency matrices into a common $H$ matrix and a data-type-specific $S_i$ matrix for each data type. The objective function has two components: the first is the summed error of the decomposition of the adjacency matrices, and the second enforces sparsity of the decomposed $H$ matrix. The $H$ matrix is used for cluster assignment, so we want it to be sparse. The two values in the square brackets are these two components of the objective function. We can certainly add more information about this to the log file.
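To make the two components concrete, here is a minimal sketch of how such a pair of values could be computed. It assumes a decomposition of the form $A_i \approx H S_i H^T$ and a squared Frobenius-norm penalty on $H$ weighted by `eta`; both the exact form and the function names are illustrative assumptions, not SUMO's actual implementation.

```python
# Sketch only: assumes A_i ~ H S_i H^T and a penalty of eta * ||H||_F^2.
# Function names and the role of `eta` are illustrative, not SUMO's API.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def sq_frobenius(X):
    """Sum of squared entries of a matrix."""
    return sum(v * v for row in X for v in row)


def objective_components(adjacencies, H, S_list, eta):
    """Return (decomposition error, penalty on H) as two separate values."""
    Ht = transpose(H)
    error = 0.0
    for A, S in zip(adjacencies, S_list):
        approx = matmul(matmul(H, S), Ht)
        residual = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, approx)]
        error += sq_frobenius(residual)
    penalty = eta * sq_frobenius(H)
    return error, penalty


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    ones = [[1.0, 1.0], [1.0, 1.0]]
    # Two toy adjacency matrices sharing one H, each with its own S_i.
    err, pen = objective_components([identity, ones], identity, [identity, identity], eta=0.1)
    print(f"[{err}, {pen}]")
```

Under these assumptions, a log line would report the two returned numbers side by side, one for the fit and one for the penalty.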
As for the log file for `run`, you can use the `-logfile` option to specify a file that the logs are written to. Are you saying that the option does not work as specified?
Sorry for coming back to you so late. Thank you very much for the explanation, I think it would be great if this is added to the log file!
For the `-logfile` option: I don't mean that it doesn't work, but rather that it could be worthwhile to change the default from (only) printing to stdout to also generating a log file. So far, by default, a `.log` file is generated for every `eta` value in the folder for each number of clusters. I think it would make sense to also write the information printed to stdout, separated by number of clusters, into one log file per number of clusters and to include these log files in the corresponding folders.
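To illustrate what I have in mind, here is a minimal sketch using Python's standard `logging` module. The function name, the `k<N>` folder layout, and the log file name are hypothetical and only serve as an example; this is not how SUMO currently sets up its logging.

```python
import logging
import os
import sys


def make_cluster_logger(outdir, k):
    """Return a logger that writes to stdout AND to <outdir>/k<k>/k<k>.log.

    The folder layout and file name are hypothetical.
    """
    folder = os.path.join(outdir, f"k{k}")
    os.makedirs(folder, exist_ok=True)

    logger = logging.getLogger(f"sumo_sketch.k{k}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger

    if not logger.handlers:  # do not attach handlers twice
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(fmt)
        logger.addHandler(stream_handler)

        file_handler = logging.FileHandler(os.path.join(folder, f"k{k}.log"))
        file_handler.setFormatter(fmt)
        logger.addHandler(file_handler)

    return logger


if __name__ == "__main__":
    log = make_cluster_logger("sumo_results_example", 3)
    log.info("message that goes to stdout and to the per-cluster log file")
```

With something like this, the scheduler still captures stdout, but the same lines also end up next to the results for each number of clusters.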
Thank you very much for this helpful tool! I have a few questions/suggestions regarding the logfile:
What exactly does the value in the square brackets behind the ℒ/Δℒ ratio mean? Here is an example:
It would be great if you could add some explanation in the logfile. Apologies if I've missed the explanation somewhere in the documentation.
I would suggest changing the default so that the log of `run` is not only printed to `stdout` but also saved in the directory where the results are saved. Then you have all the information in one place. I use SUMO on a cluster where `stdout` is handled by the scheduler, and at first I didn't look in the `stdout` file generated by the scheduler but searched in the directory created by SUMO and was confused that I couldn't find the information.
What do you think? I'm happy to help out with a PR once I know what the information means :)