Open rainersachs opened 6 years ago
Peter: you have two files "MIXTE". If these should now be deleted please delete them or ask me to. If they should be retained please give them more informative titles and move them to the Lymphocyte subfolders if you can, or perhaps delete them and then add them in the subfolders.
Here are some issues for summer 2018 and perhaps beyond.
There are problems with the files in the GitHub repository rainersachs/URAP.CA. I outline some of them below. Peter please decide, with input from Andy and perhaps Dae as needed, whether to shut down all work of the URAP CA pod till the fall semester starts, or proceed to address the problems gradually during the summer. What I need to proceed with writing the paper is some R (not Rmd) script to make lots of plots, with plot() rather than ggplot2(), to decide on a small minority of those plots that are candidates for the paper. This script needs to be in the rainersachs/URAP.CA repository on GitHub. For the minor paper, NASA guidelines mean we will need .R versions of everything in another repository named after and dedicated to the minor paper and frozen (apart from correcting errors) at the time of publication. So the new repository will refer only to the data we are using now.
URAP.CA will remain as a long-run repository with lots of data coming in. In the long run .Rmd files may be best and could be used for URAP.CA, but for the minor paper we can't use them.
One problem is that the Monte Carlo for 95% CI ribbon plots may take too long. In the very similar data set of the HG pod we can now do a Monte Carlo sample within a few minutes. But that seems not to be the case in the files in the GitHub repository URAP.CA. Is that because the URAP.CA files fail to use standard functions in standard R packages, and instead use customized functions which have the same functionality but are not as fast? Or has the CA Monte Carlo become much faster since the last time I was able to check? Another problem is that the plotting and Monte Carlo are too intermingled. We certainly do not want to run Monte Carlo anew every time we make a plot for the paper, let alone every time I want to experiment in choosing which plots to use.
Peter had a very nice solution for that: running Monte Carlo once and then recording the outcome in .csv files for use in all plots. It had the wrong 795 entry for a Z^2/beta^2 value. I think after correction we might be able to use that method for the minor paper repository. However, while writing the paper we may need more flexibility. A minor problem is that rainersachs/URAP.CA does not seem to be consistent as regards using nls() versus using nlsLM() from minpack.lm. I installed minpack.lm so I can work with either. But I still cannot get the files to run. They don't seem to fit together. Almost all of them seem to use nlsLM() but in the Graphs subfolder there is a file with the non-informative name "LAPTOP-QH0F5KBF.Rhistory" that uses nls() instead of nlsLM().
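A minimal sketch of the run-once-then-plot-from-.csv idea, assuming a toy linear model in place of the real CA model. The file name, the function name plot_cached_ribbon() and all numbers are hypothetical, not taken from URAP.CA.

```r
# Toy stand-in for the real model (assumption): effect = slope * dose,
# with an uncertain slope. None of these names come from URAP.CA.
set.seed(2018)
doses <- seq(0, 0.5, by = 0.05)
slope_draws <- rnorm(500, mean = 0.1, sd = 0.01)
curves <- outer(slope_draws, doses)  # one simulated curve per row

# Step 1 (slow, run once): summarize the draws and cache the ribbon.
ribbon <- data.frame(
  dose  = doses,
  lower = apply(curves, 2, quantile, probs = 0.025),
  upper = apply(curves, 2, quantile, probs = 0.975)
)
cache_file <- file.path(tempdir(), "mc_ribbon_example.csv")
write.csv(ribbon, cache_file, row.names = FALSE)

# Step 2 (fast, run as often as needed): plot from the cache only.
plot_cached_ribbon <- function(path) {
  cached <- read.csv(path)
  plot(cached$dose, cached$upper, type = "n", xlab = "dose", ylab = "effect")
  polygon(c(cached$dose, rev(cached$dose)),
          c(cached$lower, rev(cached$upper)), col = "grey80", border = NA)
  lines(cached$dose, 0.1 * cached$dose)  # central curve of the toy model
  invisible(cached)
}
```

Calling plot_cached_ribbon(cache_file) redraws the ribbon with base plot() without touching the Monte Carlo step, so experimenting with plot choices stays cheap.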
Peter: if you have questions before deciding whether to shut down completely for the summer let me know, preferably both by email and in the URAP.CA repository. No hurry as far as I am concerned. Thanks, Ray
Met with Peter. As time permits he will
(a) aim for a self-contained, all-R (not .Rmd), all-plot() (not ggplot2()) sub-repository suitable for removing from this repository and putting in a private Sachs repository for official NASA use with the minor paper; the NASA use is what requires these restrictions. The private Sachs repository will be named after the minor paper directory. The sub-repository will contain a (possibly cleaned up) R version of Andy's basic .Rmd file, with all Monte Carlo parts commented out. Using nlsLM() from the minpack.lm package is OK. The Monte Carlo part of the private Sachs directory will consist of 2 .csv files that can be used to run either the specific 2-ion 50-50 mixture as before or the rep(1/6, 6) 6-ion mixture as before. The sub-repository will be frozen as of the publication date, except that Sachs will correct it if errors are found. In addition to Andy's basic file the sub-repository will contain at least one example plot of a one-ion DER, the plot of the 2-ion mixture with I(d) and S(d), the corresponding plot for the 6-ion mixture, the ribbon plot for the narrow 95% CI version of the 2-ion mixture, and the broad (uncorrelated-parameter) 95% CI version of the 6-ion plot.
(b) rearrange this URAP.CA directory in any way Peter and Andy think logical, using .Rmd files as their master version. Delete lymphocytes everywhere and delete the obsolescent sub-folder. I just took out the minor files sub-folder.
(c) As time permits look into speeding up the Monte Carlo by a factor of at least 10. This has lower priority than (a) even though it is more important scientifically and for long-run use.
After the sub-repository has migrated to the new repository this URAP.CA repository should be under student control (e.g. Andy and/or Peter), not Sachs' control. In principle there should be co-owners of the repository: Sachs for long-range continuity and at least one student authorized to take all decisions about what is in URAP.CA where.
I forgot to add that some plot examples could be very rough or even be ggplot2 if plot is too unfamiliar. I think I could translate into plot myself without much trouble and certainly could do so with a little help.
Here is a .pdf on programming and major-paper plans for fall semester 2018
Andy and Peter. The figure above had a mistake. Here is a corrected version of URAP CA as we will set it up and the mouse HG file suite as already implemented
Please figure out from the mouse pod Monte Carlo or the 2-ion Monte Carlo in the CA script on GitHub how the Monte Carlo calculation can be sped up by a very large factor, and implement that while you are going over to the new flowchart.
Please let me know what hours are convenient for you for regular meetings this semester.
thanks, ray
The following is here to make sure we don't forget the following numbers and arguments.
The zero-dose data show a background prevalence Y_0 = 0.00071 instead of the larger value 0.0017. Details are in Dae's 2018 Radiation Research paper. Please change the script accordingly. Y_0 is so small that the change will result in negligible changes of any figure except perhaps a figure which zooms in very closely on very small doses. However it will actually strengthen the paper substantially. A main message of the paper will be: this is the best data set of all for seeing whether HZE induce NTE in tumor or tumor surrogate endpoints, a question important for the very low doses and dose rates astronauts encounter in interplanetary space. One reason the data is so suited for that purpose is that the background is so low. Since background and NTE are two competing explanations for the high prevalence observed at the lowest non-zero dose points, even a small decrease in a small Y_0 strengthens the case for NTE disproportionately.
Andy and Peter: Please read the above from the corrected figure down to here.
I have now added to rks_DataAndInfo.R a couple of commented lines which show how to calculate some numbers we will need when updating our input .csv scripts as Hada sends more data about extra ions. These lines should also be added to the CA version of rks_DataAndInfo.R when it is written.
Edward just emailed me the following: "2. I spent today refactoring the rest of the Monte Carlo code in the CA script. The two-ion, no covariance matrix ribbon plot now takes exactly one minute to run and the six-ion with covariance ribbon plot takes about 2.3 minutes to run. The last figure, a six-ion no covariance ribbon plot, had incomplete plotting code so I did not refactor the Monte Carlo there, but I looked it over closely and found that my previous approach can be easily adapted to it. I attach plots by the old and new code below for comparison. I noticed no distinguishable difference between the two-ion plots and small but perhaps inconsequential differences between the six-ion plots. The new changes are on URAP.CA." I haven't checked, but unless the starting point for the random number generator was held the same, the small changes Edward mentions could be just the typical differences between any 2 Monte Carlo calculations.
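A minimal sketch of the point about the random number generator's starting point, assuming a toy Monte Carlo (the mean of normal draws) in place of the CA ribbon calculation. The function run_mc() is hypothetical.

```r
# Toy Monte Carlo (assumption): the mean of n normal draws.
run_mc <- function(seed, n = 1000) {
  set.seed(seed)  # fixes the starting point of the random number generator
  mean(rnorm(n, mean = 0.1, sd = 0.02))
}

same_a <- run_mc(42)
same_b <- run_mc(42)  # same seed: exactly the same result
other  <- run_mc(43)  # different seed: ordinary run-to-run variation

identical(same_a, same_b)
abs(same_a - other)
```

If the old and new code were each run with the same set.seed() value and drew their random numbers in the same order, any remaining difference between the plots would point to a change in the calculation rather than to sampling noise.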
He added: "I would advise Andy and Peter to follow the coding, organization, and documentation style in the mouse repository. They can feel free to ask me for assistance if the style in the mouse code is unclear." I agree with this suggestion. I am not sure what the lower fig. above is.
18fa_Cucinotta_NTE_for_private_astronauts.pdf
Here is a very up to date paper that deals with almost all of the modeling issues in our calculations, uses almost all the acronyms with which most of you are by now familiar, and also gives the space-travel background for the modeling. I suggest you read the article as deep background for the projects. Among the terms that we haven't discussed or used much are Relative Biological Effectiveness "RBE" and "Quality Factor". They are not rigorously defined but even as fuzzy concepts they are quite important, so you might want to look them up.
Elementary_picture_of_LET.pdf Ballarini_2008_New_J._Phys._10_075008.pdf Here is a bit I wrote on LET because a couple of you asked about it. The Ballarini .pdf has, in the lower right corner of Fig. 1 and in Fig. 2, visual examples of track structure models more sophisticated than the very naive straight line track structure model I used in my .pdf.
Thanks for your revision of Graphs.R yesterday! I commented out the Monte-Carlo parts and that gives me exactly the type of 1-ion graphs I need for diagnosis. Nice job!
One minor quibble is that there is an option to calculate an "average" of two or more IDERs. I suspect that by "average" you mean the following: Define "Peter-pointwise-additivity" as PPA = (1/2)[E_1(d)+E_2(d)], where d is a dose. But what dose? If we are talking about a mixture experiment with total mixture dose 2d then PPA is just simple effect additivity SEA. Otherwise, as far as I can see, PPA gives you a curve which indeed lies between the E_1(d) and E_2(d) curves and will often be a rough approximation to incremental effect additivity (IEA) with r=1/2 but has no relevant interpretation at all, though the curve looks much better than the SEA baseline.
More generally, "adding" curves is not a self-explanatory idea and there are many different ways "adding" can be made precise, not only PPA, SEA, and IEA but also many other methods. For example Berenbaum's linear isobole method for monotonic increasing curves uses the inverse functions to E_k to get an inverse function for the "added" curve and typically leads to results slightly different from IEA. I suspect our entire approach might have some simpler and more general formulation if one really worked with function spaces and perhaps with some notion of a convex region in a function space, but maybe that is just a red herring.
Message to Peter, and testing my notifications. Thanks! An important type of graph. It caught me with pants down because I got no notification. I have now arranged to be notified henceforth. I assume the figure code will work when Andy adds extra data rows? Can I now download the files in the separated folder to my sandbox and add my own graphs? Some comments and questions on the figure follow.
Oxygen55 pt at CA 12+??
Oxygen350 pt at dose 0??
No SEA, to reduce clutter; shoot down earlier.
Why ribbon? Not really informative until actual mixture data.
Needs error bars somehow. Can one make point area proportional to error bar length? Use a separate panel with just one one-ion DER and its error bars to help the reader translate from error bar length to point area.
In any case make points larger and choose better color contrasts, e.g. for 6 lines: black, red, green or aquamarine2, bright brown-orange-yellow, dashed black, dashed red. Corresponding points somehow identified, e.g. open for dashed lines, otherwise solid.
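One way the "point area proportional to error bar length" idea could be sketched in base R, assuming made-up doses, prevalences and standard errors. The helper names area_cex() and draw_points() are hypothetical.

```r
# Hypothetical data (assumption): prevalence and standard error at each dose.
dose       <- c(0.05, 0.1, 0.2, 0.4)
prevalence <- c(0.010, 0.018, 0.030, 0.055)
std_error  <- c(0.002, 0.004, 0.003, 0.008)

# cex scales the linear size of a plotting symbol, so its area grows as cex^2.
# Taking the square root makes the area, not the diameter, track the error.
area_cex <- function(err, max_cex = 3) {
  max_cex * sqrt(err / max(err))
}
point_cex <- area_cex(std_error)

# Solid points (pch = 19) for solid lines; pch = 1 would give open points.
draw_points <- function() {
  plot(dose, prevalence, pch = 19, cex = point_cex, col = "black",
       xlab = "dose", ylab = "prevalence")
}
```

Calling draw_points() draws the panel. A second panel with ordinary error bars for one one-ion DER would still be needed for the reader to calibrate area against bar length.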
automate_aggregating_CA_data3.1.2019.pdf Hi Peter: Here are two R functions that I need before trying out a second panel for your multi-color dots figure. No hurry but I will suspend my paper writing till I get those functions. Thanks! Ray
Hi Peter and Kulunu! Here are some items for tomorrow's agenda. No hurry on any of this except the 2/19 deadline for URAP summer. Peter:
Kulunu:
See you both tomorrow, I hope.
Hi all: Here is the first attempt at the title and abstract of the major class paper with Hada. You may want to read through it to see how your part of the project fits into the whole and/or make suggestions and corrections. Title_and_abstract_class_paper_v1.pdf
Hi Andy: While working on Hada's six ion mixture P-24 .xlsx (which I will send you by email because it is locked so I cannot make it into a .pdf and GitHub doesn't like .xlsx) I found seven more rows for the 1-ion CSV.csv that I am preparing in my sandbox and will push to GitHub as the master 1-ion .csv as soon as I can (so Peter and Kuluhu can work on the paper ASAP). A .pdf of the under-construction 1-ion .csv is added in this comment. Please let me know ASAP if you see any obvious mistakes or duplications in the added cells. There are omissions, which I will fill in later, but I want to catch mistakes before I make further entries, e.g. extra added rows for the 4-ion mixture. If you don't have time, let me know right away. If you do have time let me know right away with an estimate of when you will have an answer (I don't need a full answer yet, just a statement that you see no obvious mistakes or have found what looks like a mistake). I am confused because I found a sheet 3 in P-24 with the extra rows and I can't see where it came from. CSV.pdf
Hi Professor Sachs
I am busy with exams this week and I will take a look at the data on Wednesday and meet with you on Thursday at usual time. I will also help you on finishing 3-sheet xlsx after the spring break.
Sincerely
Andy
On Monday, March 18, 2019 at 2:17 PM, Rainer K. Sachs, Professor Emeritus (notifications@github.com) wrote: [the message above, quoted in full, with attachment CSV.pdf https://github.com/rainersachs/URAP.CA/files/2980431/CSV.pdf]
-- Andy Zhao • 赵力阳
Bachelor of Arts Double Major in Statistics and Applied Mathematics University of California, Berkeley, Class of 2019
Hi! Hada has accepted our plan in principle. Here is her email: 3/25/2019 Hi Ray, It is great that you have been completing 82-6 data. J-218, F-0-i and F-5-i are cells I pretreated with inhibitor. Please omit these data. Situation of our team’s publication is 1) Ianik’s paper with 4 single beams (no shielding) : submitted to RRS on September, got reviewers comments on January, rewrite to answer the reviewers comments and resubmitted in February. 2) Tony’s paper with 2 beam (shielding) : Submitted to RRS on December, got reviewers comments recently, Tony is working on to revise. Please do not wait, go ahead to start preparing the paper. Attached the PDF file of previous presentation. In page 3, list of 82-6 data available. If you need any of those, let me know.
I will need to double check the .pdf Hada refers to above to see that we have all that mixture and related 1-ion data. I will do that this week. Here is the .pdf itself. GCR Consortium #17 Hada_mainly_mixed_beam.pdf Both CA and HG pods are now urged and equipped by experimentalist colleagues to go full speed ahead with paper writing in accordance with the suggestions we made to them.
Hi pod! Hada has now given us full leeway on the big paper. We can write whatever we want, for any journal we want, as fast as we can, as slow as we have to, using whatever we want of any of her data. This comment concerns progress and our upcoming workflow as regards the now top priority PLoS Biology paper.
Coincidentally the mouse pod is in almost exactly the same position: on Monday Blakeley and Chang gave us full authority to write a paper on 3 recent mixture experiments. They have deadlines, so I had to plan the mouse pod's work first. I hope that pod is now working by itself, so that the only thing it needs from me is writing the paper. I am therefore returning to the job of entering Hada's data.
Peter: However, while working on the mouse pod data entry yesterday, I found a protocol which I think is much better than our present CA protocols. It starts with an Excel workbook sheet that has all sorts of information in a particularly convenient form. Among other things it is so narrow I see its full width on one screen even when zooming in so close my lousy eyes can read it easily. I attach a .pdf of the nearly completed Excel sheet next so you can see (looking near the bottom) why the format seems so convenient to me. I will send you the Excel version by email so you can manipulate it in your sandbox after whatever name changes you have already implemented.
By deleting lots of rows and columns the worksheet can be turned into a comma separated input file. I will email that to you. Please check if it can be used with our R suite, as is or with minor changes. If so we should continue this approach or an approach which shares its main virtues. If not we have your most recent approach as a fallback option. The new protocol is slightly less automated than your protocol but is close enough to being automatic that we can implement it easily, since we will soon have entered most of the 1-ion rows we will ever use.
Andy, Kuluhu, Peter: Going forward the main assignments are: I will do most of the writing, with some help from Kuluhu. Hopefully you guys can do most of the rest because writing two papers (first drafts) before July will take me almost full time. Peter and Kuluhu should help me with figures, tables, and writing the paper. Andy should try to double check my calculations and data entry once I have finalized the temporary Excel file whose .pdf is above.
As always your core classes should have definite priority but the more you can help the better. Thanks! Ray
Attached here is a .pdf that contains about 50 extra data points pulled together from various sources, and, at the bottom, about 25 comments on different confounding factors. Probably about 25 rows will eventually change as we add, delete, and correct during the rest of this semester. However the next step is to see if all this new data breaks the nls() or other chunks of the code. I made a 1-ion input .csv from the information in the attached .pdf, but could not even get the main database to read the file. Until Peter and I can straighten that out in emails everything else needs to stay on hold, so I do not yet have suggested assignments for Andy and Kululu. In principle having all that extra data is great: we will be able to write a definitive paper. But I was amazed to find how many different ways one can make a mistake during data entry.
Here is the informational .pdf. The input .csv may have to be changed so often that I will for the time being write emails to Peter about it, with copies to Andy and Kululu.
Here is an update of the "info" .pdf of the previous comment. Such updates will continue sporadically until Hada has answered all our questions and we have double checked our data base.
Hi Peter: here are minutes of a phone meeting Edward (edwardgh@berkeley.com GitHub user name eghuang) and I just had. In a week or so I will explain to you some of the background for the proposed interactions between you and the mouse HG pod mentioned in those minutes. Don't worry about the items in the minutes until you are ready to consider dealing with them -- no hurry. Minutes of 6.24.2019.RKS_EGH_phone.pdf
Starting a new issues thread for spring semester 2018 and beyond because the old one is too long.
We need consistency with the recent paper by Ham et al. on the fibroblast data, so I am putting it here. 18sp_final_RR_Ham.pdf
Andy and Peter. In Ham et al. please study in some detail Fig. 7, Fig. 8, their captions, and the sub-section "Synergy analyses for two-ion and six-ion mixtures: 95% CI for I(d)" that contains them. From the v2 chunk of the script Andy posted Sunday I got a much broader ribbon, with much higher upper limits, than in Fig. 7. Are you getting much broader ribbons also? If so that is a major discrepancy that we need to resolve ASAP. Ham's .Rmd is in this repository as URAP.CA/Obsolescent/CAfibroGH.Rmd and probably is in Andy's sandbox. Do you get the same vcov Ham does? Are you sampling from that the same way? If you are getting the discrepancy, let's try to resolve it soon so we can get Ham's help if we are stuck.
There are 2 related problems.
1). I don't see in Andy's v2 chunk any comparison between the 2 different ways to compute 95% CI for I(d) (the two panels of Fig. 8 in Ham et al.). Is that comparison in chunk v1 or v3; or is it missing?
2). While running Andy's chunks I once got a figure that contains the same error Andy recently corrected, where a ribbon fell entirely below the I(d) curve instead of enclosing the I(d) curve as it must. Somewhere in Andy's script that mistake is still lurking (though perhaps commented out). The mistake should be cleaned out before it causes more confusion. More generally there needs to be a lot of quality control for the script.
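For the comparison asked about in 1), a minimal sketch of the two sampling schemes: drawing parameters from the full vcov versus from its diagonal only. It assumes a toy one-ion DER rather than I(d), and the estimates and covariance values are invented for illustration; they are not from Ham et al. MASS::mvrnorm() is used for the multivariate normal draws.

```r
library(MASS)  # mvrnorm(); MASS ships with standard R installations

set.seed(7)
# Hypothetical fit results (assumptions, not values from Ham et al.)
est <- c(a = 1, b = 2)
vc  <- matrix(c(0.01, -0.036, -0.036, 0.16), nrow = 2,
              dimnames = list(names(est), names(est)))
der <- function(d, a, b) a * (1 - exp(-b * d))  # toy one-ion DER
doses <- seq(0, 1, by = 0.1)

# Draw parameter sets, evaluate the curve for each, take pointwise quantiles.
ci_ribbon <- function(sigma, n = 2000) {
  draws  <- mvrnorm(n, mu = est, Sigma = sigma)
  curves <- sapply(doses, function(d) der(d, draws[, "a"], draws[, "b"]))
  data.frame(dose  = doses,
             lower = apply(curves, 2, quantile, probs = 0.025),
             upper = apply(curves, 2, quantile, probs = 0.975))
}

correlated   <- ci_ribbon(vc)              # full variance-covariance matrix
uncorrelated <- ci_ribbon(diag(diag(vc)))  # off-diagonal terms dropped
```

With the negative a-b covariance assumed here, the correlated ribbon comes out narrower than the uncorrelated one. Running both schemes on the real vcov would show how much of the breadth of the ribbon depends on the sampling method.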
Thanks! See you guys Friday I hope