kdahlquist closed 6 years ago
Within GRNmap/test_files/matlab_codes/sampleTests there are MATLAB test files that are useful as examples for beginning to write tests. However, we are beyond that level, since we already have a whole test suite to look back on for examples. These are candidates for removal.
GRNmap\test_files\matlab_codes\calculationTests\newLSETests\GeneralLSETest.m is a remnant from last semester, when we tried to work on it in tandem. It is no longer needed because a correct and completed version exists at GRNmap\test_files\matlab_codes\calculationTests\GeneralLSETest.m
Directories left for inspection for keeping
@kdahlquist is there a heuristic I could use for checking which test files are no longer needed?
folders of test functions in matlab_codes I am looking through
Test files subject for removal/discussion:
[x] GRNmap/test_files/perturbation_tests/math_post_L-curve_corrected/4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-1_fixP-1_no-graph_test1_LCurve_4_output.xlsx
[x] GRNmap/test_files/lse_tests/LSETest.m
[x] GRNmap/test_files/MSE_tests/dHAP4_15_gene_network_deletion_added_input_KD_20160126_output_with_check.xls.xlsx
[x] GRNmap/initialize_arrays_test/
GRNmap/test_files/perturbation_tests has:
[x] GRNmap/test_files/plots has graph plots that seem to be archival.
We can talk about this list of files at the meeting.
sixteen_tests should have both inputs and outputs for archival purposes (even though we still need to rerun the outputs when bugs are fixed).
Also add a README.md at the test_files folder to document the contents for future GRNmappers.
Removed the files as detailed in the discussion of the checklist I posted. Next is to create an issue to finish LSETest.m and to finish checking the rest of the test files I couldn't get to in the previous work sessions.
UPDATE: Issue for LSETest.m created at #376
I've also added a readme.md within the test_files folder. There were already instructions on how to run the test suite within matlab_codes, in a readme.txt. I will most likely reformat it as Markdown, leave it in the /test_files/matlab_codes folder, and note its availability in the readme.md in test_files.
2nd wave of folders/files that are candidates for removal or otherwise of interest:
With the 2nd wave the first part of the test file audit will have been completed. The next step is to double check that the input workbooks we keep are valid and conform to our new format as denoted in the initial issue.
Round 2 test file audit notes:
data_samples
to the top level in order to separate these from files that are actually used by the test suite.
I've removed the appropriate files; what's next is to take care of these leftover issues.
Couldn't find an issue related to the data sheet in test_files/perturbation_tests/with_manual_calculations that explained its origins. Removing as of now.
Spent quite a bit of time crawling through commit histories. The test_files/perturbation_tests/with_manual_calculations/readme.txt lists files that I cannot find anywhere; I did not see them while going back about 2 years in the commit history. The readme.txt does reference graphs from the plots folder that we marked for removal.
Overall, I believe the readme should be removed: after reading through it and not finding the other files it lists, it does not seem to have a purpose in the repository.
@dondi For the README.md that goes within test_files am I listing one by one the purposes of each folder?
For example I'd write,
etc
@jtorre39 can you point me to where the readme is so that I can take a look at it one last time? Thanks.
@kdahlquist Yes, it is located at GRNmap/test_files/matlab_codes/sampleTests/readme.txt
Test files audit now moves on to verifying that the unculled files comply with the latest input and output sheet formats (https://github.com/kdahlquist/GRNmap/wiki/How-to-format-the-input-file-for-GRNmap-v1.4-and-above, https://github.com/kdahlquist/GRNmap/wiki/How-to-interpret-the-output-file-for-GRNmap).
Brandon has already audited the 16-tests input sheets to that end; just the output sheets remain for those. @kdahlquist will post this information later.
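Part of this format check could probably be automated. Below is a minimal sketch, not something that exists in the repository: it reads the worksheet names out of an .xlsx workbook using only the Python standard library and reports which expected sheets are missing. The function names `sheet_names` and `missing_sheets` are hypothetical, and the list of required sheets has to come from the wiki format guides; it is passed in as a parameter rather than assumed here. It only handles .xlsx files (which are zip archives), not the older binary .xls format.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main workbook part in an .xlsx (Office Open XML) file.
SPREADSHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def sheet_names(xlsx_file):
    """Return the worksheet names of an .xlsx file, in workbook order.

    `xlsx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [s.get("name") for s in root.iter(f"{{{SPREADSHEET_NS}}}sheet")]


def missing_sheets(xlsx_file, required):
    """Return the names in `required` that the workbook does not contain."""
    present = set(sheet_names(xlsx_file))
    return [name for name in required if name not in present]
```

For example, `missing_sheets(path, ["optimization_parameters", "production_rates", "network_weights"])` would check for the three sheets mentioned in this thread; the full list of required sheets should be taken from the wiki pages linked above.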
I'm pasting the text of the readme file GRNmap/test_files/matlab_codes/sampleTests/readme.txt here:
The scenario is a four gene network. Two genes are purely self regulated; two others feedback to each other. The data was created by executing two forward simulations, one with all four and one with a single gene deleted. This pair gives us the two data sheets (wt and dcin5).
Input_4_gene_inverse.xls is the input sheet for the estimation run. Input_4_gene_inverse_estimation_output_archived.xls is the output sheet. Name has been changed so that when you run the code, you won't overwrite this file. Input_4_gene_inverse_estimation_output_archived.mat is the output matlab binary file. Name has been changed so that when you run the code, you won't overwrite this file. figure_1, figure_2, and figure_3 are saved output. once again I have added the phrase _archived to the name. figure_4 is generated in the output, but there is no saved .jpg file (bug?).
also included is the file Input_4_gene_testing.xlsx, which was used to generate the forward (model-generated) testing data. In this file, you can see "the right answers" to the estimation problem, in the production_rates and network_weights sheets. fix_b is set to one, so the b's are not estimated in this example.
I still need to look at this and figure out the place where this information needs to reside. In the meantime, since the text is here, you can get rid of the file itself.
Got it!
Directories left for validating input sheet formats:
Many input sheets have an extra row or two. These rows are called sheet and deletion in the optimization_parameters sheet. They are not mentioned in the input sheet format guide for 1.4 and above. Additionally, they do not exist in the sixteen_tests sheets, so I have removed them from the input sheets I've gone over.
Is there a reason why certain data cells are highlighted within GRNmap\test_files\MSE_tests\dHAP4_15_gene_network_deletion_added_input_KD_20160126.xlsx?
Within optimization_diagnostic_test/optimization_diagnostic_under_100_iterations_test.xlsx, in the network_weights sheet, the data cells are formatted as Number instead of General. Should they be changed to General?
It might be worth noting in the wiki whether data cells should be formatted as General or Number, for additional clarification.
The `sheet` row can be deleted.
`deletion` has been renamed to `strain`, so if a `strain` row already exists, `deletion` can be removed.

On the issue of highlighted cells, these are remnants of pre-missing-data workarounds where missing cells were populated with average values then highlighted so that users would know which cells were missing.
Nothing needs to be done with the files themselves; however a note in the new README explaining this will help retain this information for the future.
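For reference, the two row rules above can be written down as a small function. This is only an illustrative sketch under a few assumptions: the optimization_parameters sheet has already been read into a list of rows whose first cell is the row label, the labels are spelled exactly `sheet`, `deletion`, and `strain`, and a `deletion` row with no accompanying `strain` row should be renamed rather than dropped (that last case is my reading of "renamed", not something stated explicitly). The name `clean_parameter_rows` is hypothetical.

```python
def clean_parameter_rows(rows):
    """Apply the row clean-up rules to the rows of an optimization_parameters sheet.

    `rows` is a list of rows, each a list whose first cell is the row label.
    Returns a new list of rows; the input is not modified.
    """
    has_strain = any(row and row[0] == "strain" for row in rows)
    cleaned = []
    for row in rows:
        label = row[0] if row else None
        if label == "sheet":
            continue  # obsolete row: always dropped
        if label == "deletion":
            if has_strain:
                continue  # superseded by the existing strain row
            # Assumption: with no strain row present, rename instead of dropping.
            row = ["strain"] + list(row[1:])
        cleaned.append(list(row))
    return cleaned
```

All other rows pass through unchanged and in their original order.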
Took some time, but I checked each of the input sheets! Anything with an extra row has been corrected. I didn't catch any oddities other than the ones mentioned above.
That is great! Thanks for doing this!
I closed a duplicate issue #177. In that issue, @juansc and @trixr4kdz created a wiki page with documentation of the unit tests. We should revisit this wiki page and update it.
We should look into writing a script that would automatically generate testing documentation as has been discussed for the GRNsight team.
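As a possible starting point for such a script, here is a rough sketch that walks a directory such as test_files and produces Markdown lines summarizing each folder by file type. It is not the approach discussed for GRNsight (which is not described in this thread), and the function name `summarize_test_files` and the output layout are assumptions; a real documentation generator would presumably also describe what each test checks.

```python
import os


def summarize_test_files(root):
    """Return Markdown lines listing each non-empty folder under `root`
    with a count of its files per extension."""
    lines = ["# Contents of " + os.path.basename(os.path.normpath(root)), ""]
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a deterministic order
        if not filenames:
            continue
        # Use forward slashes so the output is the same on every platform.
        rel = os.path.relpath(dirpath, root).replace(os.sep, "/")
        counts = {}
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] = counts.get(ext, 0) + 1
        summary = ", ".join(f"{n} {ext}" for ext, n in sorted(counts.items()))
        lines.append(f"- `{rel}`: {summary}")
    return lines
```

Joining the result with newlines and writing it to a file would give a listing that can be regenerated whenever the test files change, with hand-written descriptions maintained separately.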
This issue can finally be closed. Will write the testing documentation task(s) up as a new issue.
We need to perform an audit of the "test_files" directory.
For any files we keep, we need to check that
Assigning this a 0.5 priority because other issues have priority right now. However, when we tackle this, we want to be thorough and detail-oriented so that we don't have to do it again moving forward. Paying off some more technical debt.