Closed AprilYUZhang closed 9 months ago
In commit # 0632f47, I copied the get_input_options()
from tinyhouse, but I just changed the input name when it parsed parameters. The reason why I didn't change the name in the parse_dictionary
, pedigree, genotypes, seqfile, bfile
are involved many times in the program calls of tinyhouse.
I found a recombination rate table including human, pig, rat, and so on. Aimed at different species, we need different recombination rates, It may be useful Dumont BL, Payseur BA. Evolution of the genomic rate of recombination in mammals. Evolution. 62:276, 2008.
Thanks for sharing. There are now even newer recombination rates for some species.
In commit # 0632f47, I copied the
get_input_options()
from tinyhouse, but I just changed the input name when it parsed parameters. The reason why I didn't change the name in theparse_dictionary
,pedigree, genotypes, seqfile, bfile
are involved many times in the program calls of tinyhouse.
@AprilYUZhang @XingerTang I see you are challenged quite a bit by trying to keep tinyhouse
intact, while changing AlphaPeel
. I spoke to Hannes today (@bryo-han) and he was of the view that you are effectively doing double work - now on AlphaPeel side and then in the future we will have to fix tinyhouse
again. What about actually breaking tinyhouse
for other programs? All other programs bring in tinyhouse
dependency that is linked to a particular commit via a git submodule. So, if we push development of tinyhouse
further, we will not immediately break other programs. Once we do the changes we need in AlphaPeel and tinyhouse we can always go an fix any other AlphaX program so that it uses the new tinyhouse
version. This should simplify things a lot!? Thoughts?
In commit # 0632f47, I copied the
get_input_options()
from tinyhouse, but I just changed the input name when it parsed parameters. The reason why I didn't change the name in theparse_dictionary
,pedigree, genotypes, seqfile, bfile
are involved many times in the program calls of tinyhouse.@AprilYUZhang @XingerTang I see you are challenged quite a bit by trying to keep
tinyhouse
intact, while changingAlphaPeel
. I spoke to Hannes today (@bryo-han) and he was of the view that you are effectively doing double work - now on AlphaPeel side and then in the future we will have to fixtinyhouse
again. What about actually breakingtinyhouse
for other programs? All other programs bring intinyhouse
dependency that is linked to a particular commit via a git submodule. So, if we push development oftinyhouse
further, we will not immediately break other programs. Once we do the changes we need in AlphaPeel and tinyhouse we can always go an fix any other AlphaX program so that it uses the newtinyhouse
version. This should simplify things a lot!? Thoughts?
@gregorgorjanc
I understand that we may need to do these changes in tinyhouse
later as well, but @AprilYUZhang has already finished all the modifications related to the tinyhouse
. And the update of the option names of tinyhouse
is more difficult than that of alphapeel
as the option names are used everywhere among the 10 files of tinyhouse
. It would be more likely for us to miss some of them while modifying the code.
Meanwhile, tinyhouse
may still have some bugs unfixed, using different versions of it means we need to introduce changes to different versions of it every time we want to make an update.
I think it would be better if we just stick with this "double work" for a while and then when we are ready to make some big changes to the tinyhouse
, we can do everything at once.
the update of the option names of
tinyhouse
is more difficult than that ofalphapeel
as the option names are used everywhere among the 10 files oftinyhouse
. It would be more likely for us to miss some of them while modifying the code.
@XingerTang why would this be hard? It's mostly an input and maybe some variable name change, which is less than doing extra logic, particularly in https://github.com/AlphaGenes/AlphaPeel/pull/122/commits/5821d432e533277590a2b620dfbb4d8feb76708e. The part that I am worried is that while these current changes are not large, they are adding extra logic to AlphaPeel
and when we decide to polish tinyhouse
we will forget that some of this extra logic should be removed from AlphaPeel
.
the update of the option names of
tinyhouse
is more difficult than that ofalphapeel
as the option names are used everywhere among the 10 files oftinyhouse
. It would be more likely for us to miss some of them while modifying the code.@XingerTang why would this be hard? It's mostly an input and maybe some variable name change, which is less than doing extra logic, particularly in 5821d43. The part that I am worried is that while these current changes are not large, they are adding extra logic to
AlphaPeel
and when we decide to polishtinyhouse
we will forget that some of this extra logic should be removed fromAlphaPeel
.
@gregorgorjanc
It would be hard doesn't because it's hard to write anything, but the code of AlphaPeel
and tinyhouse
are messy, and it is hard to track where the option names go. For those option names that are adding/modifying functionality, it is not enough to do "search and replace" as we need to find everywhere it is used and understand how it is used.
In the above, @AprilYUZhang found where the errors are updated and modified the line of the condition for estimating genotype error and sequence error before that line,
from
if args.esterrors:
PeelingUpdates.updatePenetrance(pedigree, peelingInfo)
to
if args.est_geno_error_prob or args.est_seq_error_prob:
PeelingUpdates.updatePenetrance(pedigree, peelingInfo)
but that is not enough, we need to look up quite closely to see which line corresponds to the update of genotype error and which line corresponds to the update of sequence error inside PeelingUpdates.updatePenetrance(pedigree, peelingInfo)
and add additional conditions for them, even though there is no args.esterrors
in it. We need to be quite familiar with the code to do all the changes without missing anything. And, when we are facing tinyhouse
, there are 10 files. It is easy to lose track of where and how some values are passed from one place to another, but we need to understand the uses of all of them before making changes. That is the time-consuming part.
Also, we don't want code inside tinyhouse
inconsistent, so we cannot modify only the AlphaPeel
related part of tinyhouse
.
When we polish tinyhouse
, we also have to check every place of AlphaPeel
that is related to tinyhouse
to avoid inconsistency, so we won't miss any extra logic.
@XingerTang @AprilYUZhang I think I understand now. Ok, let's continue and be mindful of the steps beyond the current step. I appreciate your efforts!
I find ".called." is not fixed in writeBinaryCalledGenotypes
. Although it hasn’t been realized, do we need to change it? @gregorgorjanc @XingerTang
I find ".called." is not fixed in
writeBinaryCalledGenotypes
. Although it hasn’t been realized, do we need to change it? @gregorgorjanc @XingerTang
@AprilYUZhang It's not important for now. I prefer to leave that to the future.
I find ".called." is not fixed in
writeBinaryCalledGenotypes
. Although it hasn’t been realized, do we need to change it? @gregorgorjanc @XingerTang@AprilYUZhang It's not important for now. I prefer to leave that to the future.
@AprilYUZhang open an issue and we get back to it in future
I updated the summary of changes #113
the problem of these two options are similar toIn commit #0632f47. I just changed the part parsing argument rather than all. When we consider to change tinyhouse, it will be fixed in the whole software.
In commit # 0632f47, I copied the
get_input_options()
from tinyhouse, but I just changed the input name when it parsed parameters. The reason why I didn't change the name in theparse_dictionary
,pedigree, genotypes, seqfile, bfile
are involved many times in the program calls of tinyhouse.
the problem of these two options are similar toIn commit #0632f47. I just changed the part parsing argument rather than all. When we consider to change tinyhouse, it will be fixed in the whole software.
In commit # 0632f47, I copied the
get_input_options()
from tinyhouse, but I just changed the input name when it parsed parameters. The reason why I didn't change the name in theparse_dictionary
,pedigree, genotypes, seqfile, bfile
are involved many times in the program calls of tinyhouse.
@AprilYUZhang I don't know to which question/comment you are referring to with this answer. Can you clarity? Is this related to my comment in this thread https://github.com/AlphaGenes/AlphaPeel/pull/122#discussion_r1399369654?
Sorry, I expressed it ambiguously. It is indeed easy to be confused with rec_prob
because I don't mention which option I'm talking about. I mean out_id_order
, out_id_only
is like pedigree
, phasefile
. These options involved the tinyhouse.
@AprilYUZhang I think this looks very good all in all! Are there any more planned changes - it seems you have done all of the points in https://github.com/AlphaGenes/AlphaPeel/issues/113!?
@XingerTang I would appreciate quick code review from your end before we merge this in.
@AprilYUZhang @gregorgorjanc There are still some changes that need to be made before we merge. Please see the above comments.
@AprilYUZhang @gregorgorjanc I just noticed that we still have an s
in n_cycles
which is the updated name of ncycles
. Should we keep it or change it to n_cycle
?
(also the n_threads
and n_io_threads
)
@AprilYUZhang @gregorgorjanc I just noticed that we still have an
s
inn_cycles
which is the updated name ofncycles
. Should we keep it or change it ton_cycle
?
@XingerTang well spotted! @AprilYUZhang please check all the options and let us know what you suggest as a unified approach regarding the plural and the associated “s”.
Solved the above problems.
@gregorgorjanc @AprilYUZhang Sorry I forgot the commit here is not up to date and sent the wrong comment, I already deleted that. @AprilYUZhang Could you pull the recent changes that just merged into the branch, and let's see how the test go
@AprilYUZhang Could you pull the recent merged commits again? Let's see how the tests run :)
@gregorgorjanc Could you please merge this pull request?
@gregorgorjanc Could you please merge this pull request?
@XingerTang i lost track of which of the above discussions and requested changes have been addressed l please go over them and resolve each one (maybe the changes have all been achieved?) and then we merge this in,
@gregorgorjanc Yes, all the changes are achieved and every problem is addressed. I think we can merge this in.
I found a recombination rate table including human, pig, rat, and so on. Aimed at different species, we need different recombination rates, It may be useful Dumont BL, Payseur BA. Evolution of the genomic rate of recombination in mammals. Evolution. 62:276, 2008.