kdahlquist / GRNmap

Gene Regulatory Network modeling and parameter estimation
BSD 3-Clause "New" or "Revised" License

Coding team tasks for the rest of spring, 2017 #314

Closed: kdahlquist closed 7 years ago

kdahlquist commented 7 years ago

Tasks for @trixr4kdz, @cazinge, and @jtorre39:

Tasks for @kdahlquist:

azinge commented 7 years ago

Update for week:

dondi commented 7 years ago

Immediate priority is #301: remove strikeouts and make a final reading before submitting.

kdahlquist commented 7 years ago

I just want to record that, as discussed in the meeting, MSE is not the same as LSE.

It would be a good idea to review the Dahlquist et al. 2015 paper to understand the math behind the LSE.

trixr4kdz commented 7 years ago

For this week's session, we determined which variables need to be tested to ensure that the LSE function works as intended. From this, I've created templates for testing the lse and gLSE routines so that we only need to worry about figuring out what the expected values are for those routines.

kdahlquist commented 7 years ago

Task list has been updated for this week; see issue for updates.

trixr4kdz commented 7 years ago

For creating the LSE tests, see the comments on #313.

TL;DR: Testing the LSE function would require assuming that the current output is correct, so that implementing the compressMissingData function does not change it. In practice, this probably means generating a new set of expected outputs for the different use cases, just like the "sixteen_tests" Excel files, which already differ from the current output of the code. @cazinge suggested implementing the new data structure in parallel.

For the other issues:

kdahlquist commented 7 years ago

Task list remains unchanged. I noted at the meeting that if you are planning to leave early next Friday for Spring Break, you need to arrange to do your research hours earlier in the week.

bengfitzpatrick commented 7 years ago

@cazinge @trixr4kdz @jtorre39 please let me know when you're up-and-running in the UH lab.

im-deepfriedwater commented 7 years ago

@bengfitzpatrick We're set to go!

im-deepfriedwater commented 7 years ago

@bengfitzpatrick stopped by during our session and gave us specifics on how to approach our tests. Here are the results from our discussion:

[Image: Fitzpatrick's Whiteboard]

[Image: Young Eddie's Whiteboard]

trixr4kdz commented 7 years ago

For this week's meeting, @bengfitzpatrick helped us map out what needed to be in the gLSE tests. More specifically, two of our test cases were taken from:

Note that in both cases, L = 0.

trixr4kdz commented 7 years ago

The next steps include writing more test cases that build on the ones we already made and working out the answers by hand.

trixr4kdz commented 7 years ago

For reference, this is the equation that we will use to determine what L would be based on our changes to the test data:

[Image: equation for L]

where nData = (# of flasks) × (# of timepoints) × (# of genes) × (# of strains)
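
As a quick sanity check of the count above, here is a minimal Python sketch. The function name and the example dimensions are hypothetical and are not taken from GRNmap (which is written in MATLAB); it also assumes a complete design in which every flask, timepoint, gene, and strain combination is present.

```python
def n_data(flasks: int, timepoints: int, genes: int, strains: int) -> int:
    """Number of data points for a complete design (assumes no missing values)."""
    return flasks * timepoints * genes * strains


# Hypothetical example: 3 flasks, 4 timepoints, 2 genes, 1 strain.
print(n_data(3, 4, 2, 1))  # 24
```

When hand-computing expected values for a test case, this count is presumably the number that has to be adjusted whenever replicates are removed from the test data.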

dondi commented 7 years ago

Main work at the meeting is #313, with #310 and #311 to follow once #313 is done.

trixr4kdz commented 7 years ago

I wrote some comments on new tests for issue #313.

I have now started working on #310 on integrating the new data structure since the new tests for gLSE have been merged (PR #339).

trixr4kdz commented 7 years ago

Minor question: how does the SSE calculation change with the new data structure?

kdahlquist commented 7 years ago

microData will be changed to expressionData with the fields raw, compressed, strain, avg, deletion, t, stdev

microData(index).data in https://github.com/kdahlquist/GRNmap/blob/48da7b9d78a4640a38b20a64220967f489501706/matlab/readInputSheet.m#L61 becomes expressionData(index).raw

expressionData(index).data in https://github.com/kdahlquist/GRNmap/blob/beta/matlab/compressMissingData.m#L4 becomes expressionData(index).compressed
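
To make the renaming concrete, here is a rough Python sketch of the new record layout. Only the field names (raw, compressed, strain, avg, deletion, t, stdev) come from the comment above; the field types, the defaults, the `from_micro_data` helper, and the keys of the old record are all assumptions made for illustration, and the real structure is a MATLAB struct array.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class ExpressionData:
    # Field names follow the comment above; the types are guesses.
    raw: List[List[float]]                         # formerly microData(index).data
    compressed: Optional[List[List[float]]] = None  # filled in later by compressMissingData
    strain: str = ""
    avg: Any = None
    deletion: Any = None
    t: List[float] = field(default_factory=list)
    stdev: Any = None


def from_micro_data(old: dict) -> ExpressionData:
    """Hypothetical helper: map an old-style record with a 'data' key to the new layout."""
    return ExpressionData(
        raw=old["data"],
        strain=old.get("strain", ""),
        t=list(old.get("t", [])),
    )


record = from_micro_data({"data": [[1.0, 2.0]], "strain": "wt", "t": [0.4, 0.8]})
print(record.raw, record.compressed)
```

Keeping raw and compressed as separate fields means code that needs the original values and code that needs the data with missing entries removed can read from the same record without overwriting each other.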

im-deepfriedwater commented 7 years ago

Courtesy of @bengfitzpatrick, a breakdown of the loop within GLSE:

[Image: 2017-04-05 12 18 48]

kdahlquist commented 7 years ago

I just wanted to note that at the meeting today, we advised this course of action:

  1. Make sure the current LSE tests with the current data structure are functional (I believe that @jtorre39 confirmed that this was the case).
  2. Make the modifications to the LSE code.
  3. Run the test with the complete (no missing) data test used on the original LSE code and verify that you get the same results.
  4. Write the additional tests needed to capture all the missing data cases that we previously listed.

trixr4kdz commented 7 years ago

For the points mentioned above by @kdahlquist:

trixr4kdz commented 7 years ago

So I ran ALL the tests this time, even MSETest (which unfortunately still takes about 30 minutes for just that one test), and they all pass.

One thing that I observed when I ran the code with the sixteen_tests Excel files, however, is that the precision error we had with timepoint 1.2 came back. As a result, GRNmap will think that the third replicate for timepoint 1.2 is a different timepoint (i.e., we have 3 replicates for t=0.4, 3 reps for t=0.8, 2 reps for t=1.2, 1 rep for t=1.2...0001, and 3 reps for t=1.6). In this case, GRNmap will produce a warning since it thinks that there is only 1 replicate for t=1.2...0001.
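
The kind of precision error described above can be reproduced in a few lines of Python. This is only an illustration of the floating-point effect and of one possible way to group replicates with a tolerance; it does not show how GRNmap reads timepoints, and the function name and the tolerance value are assumptions.

```python
import math


def group_replicates(timepoints, tol=1e-9):
    """Count replicates per timepoint, treating values within tol of each other as equal."""
    groups = []  # each entry is [representative_value, replicate_count]
    for t in timepoints:
        for g in groups:
            if math.isclose(g[0], t, rel_tol=0.0, abs_tol=tol):
                g[1] += 1
                break
        else:
            groups.append([t, 1])
    return [(rep, count) for rep, count in groups]


# The third replicate of t=1.2 is computed as 0.4 * 3, which is not exactly 1.2.
times = [0.4] * 3 + [0.8] * 3 + [1.2, 1.2, 0.4 * 3] + [1.6] * 3

print(0.4 * 3 == 1.2)           # False
print(len(set(times)))          # 5 distinct values under exact comparison
print(group_replicates(times))  # [(0.4, 3), (0.8, 3), (1.2, 3), (1.6, 3)]
```

With exact comparison, the computed value is treated as a fifth timepoint with a single replicate, which matches the symptom reported above. Comparing with a small absolute tolerance (or rounding timepoints to a fixed number of decimals before grouping) folds it back into t=1.2.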

kdahlquist commented 7 years ago

Closing out the work of last semester!