Closed josemduarte closed 7 months ago
Thanks @josemduarte! Here are my immediate responses to your questions:
variable camel case vs snake case: my IDE has PEP8 which requires snake case, but I see camel case already there. Is there strong feelings about this?
Yes, in all of the ExDB code we expect camelCasing (specifically, all classes should start with a capital letter and everything else starts with a lowercase letter). We have pylinting and formatting setup to check for that.
headers and other ancillary things. I copy pasted the headers from another file. Should I follow some more strict guidelines?
The headers are less important, but looks like you followed the standard so I think it's good.
version numbers: are they set manually? or is it part of the CI pipeline
Yes they need to be updated manually. There are two files for this:
You'll need to do the same for the py-rcsb_utils_seqalign
package.
how to test this? The issue is it depends on an unreleased rcsb.utils.seqalign package. Any tricks for that?
There are two ways to test your updates—manually (locally) or automatically on Azure. Testing them locally allows you to test multiple development packages at once, without them needing to be released. You can do so by installing each package locally using the editable mode (i.e., go into the package directory and type pip install -e .
). Then, you can either go into the test directory and run the test scripts directly, or by running tox (see the note at https://github.com/rcsb/py-rcsb_db/blob/master/README.md#installation).
To be honest, I sometimes have difficulty running the test suite for py-rcsb_db
locally, as it requires several fixture
scripts to be run first, which kind of set things up (e.g., setting up a fake mongo instance, etc.). It can certainly be done, but just a warning you might have to do a bit of work to get there.
To get Azure to test things automatically, you simply need to create a PR as you've already done here. In fact, you can already see the results of your builds here: https://dev.azure.com/rcsb/RCSB%20PDB%20Python%20Projects/_build
Generally, for your situation, if it's possible to complete the seqalign package work separately first (including adding tests suites to confirm it does what you want it to do), then I'd recommend doing that first, publishing that PR when ready, and then resume work on this package. That way, Azure will pick up the latest version of seqalign to run your tests.
Otherwise, if the two are too tightly coupled, then you'll need to run the tests locally as I described above. I can also show you how to run the weekly update workflow using these code changes too, which would probably be a good idea before finalizing everything.
Hi @josemduarte, @brindakv and I were talking this morning and were considering whether this addition would better belong in py-rcsb_exdb, as you and I had contemplated last Friday. I think we should chat about this tomorrow.
Also, just FYI, it looks like there is still one remaining linting issue:
rcsb/db/cli/MmseqsSequenceClustersExec.py:26:186: E501 line too long (188 > 185 characters)
As mentioned in RO-4226, I'm going to close this in favor of https://github.com/rcsb/rcsb-redwood/pull/131 . The branch will still be there in case we want to come back to it.
@piehld this is my initial attempt. Note this requires changes in rcsb.utils.seqalign that I'll PR separately.
Right now, I'd appreciate some early feedback about this.
For instance some questions:
rcsb.utils.seqalign
package. Any tricks for that?Questions for later: