Open chubukov opened 7 years ago
A similar issue is that I worry that we're getting away from the object-oriented approach. Here's an example function definition (just one example of many).
double PeakGroup::getExpectedMz(int charge, map<string, bool> isotopeAtom, int noOfIsotopes)
This function seems to get called with arguments taken from main window parameters, and it checks if the isotope that the user is looking for is within the maximum M+n. Why does any of this need to happen here? It seems that PeakGroup::getExpectedMz()
should really be available with no arguments at all -- all the parameters (like charge, if needed) should be already available in the PeakGroup object. And whether this particular PeakGroup is an isotope that's within the allowed scope is a check that should happen completely outside, if it even needs to happen.
Similar examples seem to be popping up a lot. Again, this just goes to modularity -- if we can make the organization more consistent it will make it easier for more people to develop concurrently.
It may be that I'm missing something about what these functions need to do and why they're organized the way they are. But if that's true, then I fear almost no one will be able to follow the logic.
@chubukov We are setting up contributing/code guidelines for developers/contributors. Also, we'll put in place PR review guidelines. We have identified the following steps to improve the quality of code:
We'll refer some good open-source c++/qt repositories like
We'll make sure that the following are in place in Sprint 8.
@sahil21 that sounds good, and will probably help. I do think that we might have to do some more global refactoring at some point to really address some of the messiness that's been present for a long time.
I'm wondering what we can do to improve code organization and modularity and how we should organize the effort, assuming it's worth doing. The thing that most concerns me is the amount of repeated code. For instance, the logic for the new EIC sum/max feature is repeated basically identically in three different places. As another example, the code for pullIsotopes() and pullIsotopesBarPlot() is 90% identical. This is not limited to recent changes by any stretch, there are tons of legacy code segments that have identical logic implemented in multiple parts of the code.
I think it's a problem because it makes it much harder to get started with the code, and it makes concurrent development much trickier. It also makes it harder to track down bugs in new features.
I think the original refactor that started ElMaven made some significant improvements in this regard from original Maven, but I don't entirely remember.