Closed rollf closed 1 year ago
@etal Could you have a look into this when time permits?
(I believe the conflicts arose due to new formatting)
Thanks! I'm reviewing this PR now. Sorry for forcing a rebase, but hopefully the codebase is a little easier to work with now.
@etal Thanks for reviewing. You haven't said whether you're globally fine with my approach but I assume it's okay then. I will change the code towards adding the treat_par...
parameter into the meta
dict. This should then already clean up things a little bit.
You haven't said whether you're globally fine with my approach but I assume it's okay then. I will change the code towards adding the
treat_par...
parameter into themeta
dict.
Yes, I'm globally fine with your approach, and I appreciate your efforts here. My only restriction in this PR is that read_cna(filename)
and CNA(table)
should work as expected on a non-human genome like yeast. For human-specific behavior I prefer named keyword arguments over positionals, but you can see I haven't been consistent with that either, so I won't fuss too much about it.
Hey @etal,
thanks again for your feedack. I have been working on other topics but I had also tried to make headway here. However, I was really unsatisfied with my implementation. Mainly, because I feel like I am not able ensure consistency (in terms of behavior) across the whole code base (i.e. I can't make sure that the necessary parameter(s) is/are always available when creating/dealing with CopyNumberArray
objects). I'm currently pursuing a more global approach and will of course get back to you once I have something to show.
@etal I worked through the code base again and I'm now done. Can you please carefully review this again? I have updated the initial PR comment as well as added further comments since I have changed my mind and found a better way (IMHO) to implement this optional feature. The tests are currently failing due to dependency issues (see my comment in core.txt
). I had also used blackd
but refrained from doing so because I felt like I'm not using the same configuration (some parts of the code were formatted which I didn't touch).
Here are 2 x 3 plots. For the male & the female sample from a mixed pool, I have plotted chr X once based on the current master branch, once based on my branch without the feature enabled and once with the feature enabled. The male bias towards a gain and the female bias towards a loss are fixed (last plot for each sex).
male:
female:
Hi @etal,
thanks for the feedback. I renamed the file as you suggested. There are still two tests failing. For me, this looks like a dependency issue (after pinning pomegranate). Let me know how you would like to handle this.
Hi @etal,
I will be on a longer summer break for July & August. Do you think we could finish this before end of June? I'm happy to support you if needed but I don't see anything I could do for now.
Sure. I'm not too fond of the tests py3.8-min
and py3.10-macos
. Ignoring those tests, do you feel this PR is ready to merge?
@etal Yes, this is ready for merge IMHO.
All done, squashed and merged. Thanks for your contribution!
Yeah! :rocket:
This PR addresses issue #789. It adds the optional feature to treat the PAR1/2 on chromosome X as autosomal, i.e. when requesting autosomal regions these regions are included and on the other hand, when requesting "chromosome X", they are excluded. For chr Y, the PAR1/2 regions are excluded when requesting "chromosome Y" since they cannot be covered if all reads map to PAR1/2 on X. The results have already been shown in the issue mentioned above.
This PR contains the following changes:
--diploid-parx-genome
which can be set togrch37
orgrch38
in order to treat PAR1/2 GRCh37/38 regions on chromosome X as autosomal.GenomicArray.autosomes()
has been re-implemented inCopyNumArray.autosomes()
to be able to transparently include/exclude PAR1/2 on demand.(cnarr.chromosome == cnarr._chr_x_label)
), this has been replaced by a call toCopyNumArray.chr_x_filter()
which includes/excludes PAR1/2 as needed. Same for chromosome Y.chrx = self[self.chrx_filter()]
instead ofself[self.chromosome == self.chr_x_label]
(or even `self[self.chromosome == "chrX"]) - but it's in the nature of Python to give full access to object members.call.absolute_dataframe()
.read_ga
similar toread_cna()
and use where appropriate (=GenomicArray tests)I will add some further review comments.