Amber-MD / cpptraj

Biomolecular simulation trajectory/data analysis.
Other
138 stars 64 forks source link

Add 'parsetiming' command #1046

Closed drroe closed 1 year ago

drroe commented 1 year ago

Version 6.20.3. Adds the parsetiming command, which can be used to extract and report CPPTRAJ timing data from CPPTRAJ output. Intended for use in benchmarking. Updates the manual and adds a test.

  [help parsetiming]
    <filename args> ... [out <file>] [name <setname>]
    [sortby {time|cores|filename}] [includebad] [showdetails]
    [type {trajproc|trajread|actframe}] [reverse]
    [groupout <file> [grouptype {prefix|name|kind}]]
drroe commented 1 year ago

So the python failure is a weird segfault when pytest tries to run:

Run source cpptraj.sh && cd pytraj/tests && pytest -vs --ignore=test_parallel_pmap
============================= test session starts ==============================
platform linux -- Python 3.[11](https://github.com/Amber-MD/cpptraj/actions/runs/5892194564/job/15980909556?pr=1046#step:9:12).4, pytest-7.4.0, pluggy-1.0.0 -- /usr/share/miniconda/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/cpptraj/cpptraj/pytraj
double free or corruption (!prev)
Fatal Python error: Aborted

Current thread 0x00007f1d1e69b740 (most recent call first):

I think the issue here is some weirdness with conda, since I'm requesting the 3.8 environment but clearly getting conda with 3.11. I think this might be because the setup-python GitHub action is really designed for something like pip (https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python). For conda it seem a different action is more appropriate (https://github.com/marketplace/actions/setup-miniconda).

I think that to remedy this, I'm going to try to switch to pip since that seems to be recommended with GitHub actions (at least that is what's shown in the documentation). Not sure if this is a good long-term solution since Amber uses python, but at this point I just need pytraj to build and test reliably without crashing every other PR. @hainm @swails any thoughts on this?

drroe commented 1 year ago

Ugh. Using pip and python 3.10, still have a problem:

============================= test session starts ==============================
platform linux -- Python 3.10.[12](https://github.com/Amber-MD/cpptraj/actions/runs/5894396439/job/15987951974?pr=1046#step:8:13), pytest-7.4.0, pluggy-1.2.0 -- /opt/hostedtoolcache/Python/3.10.12/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/cpptraj/cpptraj/pytraj
double free or corruption (!prev)
Fatal Python error: Aborted

Current thread 0x00007f1084[16](https://github.com/Amber-MD/cpptraj/actions/runs/5894396439/job/15987951974?pr=1046#step:8:17)de00 (most recent call first):

But pytraj builds and tests fine locally for me with conda3/python 3.11. Since I can't reproduce the error I'm going to ignore it if everything else passes because this has already cost me a day of development time.

drroe commented 1 year ago

Full trace here: https://github.com/Amber-MD/cpptraj/actions/runs/5894396439/job/15987951974?pr=1046

hainm commented 1 year ago

@drroe I've triggered a PR in pytraj here and it seems fine: https://github.com/hai-schrodinger/pytraj/actions/runs/5894732127/job/15988968600

wondering what's the difference.

hainm commented 1 year ago

@drroe Let me test the build here: https://github.com/Amber-MD/cpptraj/pull/1047

drroe commented 1 year ago

wondering what's the difference.

I don't know, but the fact that it's pytest that dies makes it seem like CI is the problem, not cpptraj or pytraj.

hainm commented 1 year ago

wondering what's the difference.

I don't know, but the fact that it's pytest that dies makes it seem like CI is the problem, not cpptraj or pytraj.

um, I force to use python 3.8 here but still getting the segmentation fault: https://github.com/Amber-MD/cpptraj/actions/runs/5895317440/job/15990821342

Meanwhile, the PR in pytraj is ok: https://github.com/Amber-MD/pytraj/actions/runs/5894734864/job/15988988772

I am wondering if the segmentation fault comes from your current change or not (If yes, the CI actually does the job catching it).

cpptraj log:

image

pytraj log:

image
hainm commented 1 year ago

oh wait, I was dumb. I think this PR does not have your new change yet. lol. https://github.com/Amber-MD/cpptraj/pull/1047

trying to find out what the difference between pytraj and cpptraj github action setups.

drroe commented 1 year ago

I am wondering if the segmentation fault comes from your current change or not (If yes, the CI actually does the job catching it).

Except I can't reproduce the bug locally (conda 23.3.1, python 3.11.4). It only crashes on GitHub.

hainm commented 1 year ago

@drroe Please merge my change here to yours: https://github.com/Amber-MD/cpptraj/pull/1047

So the difference here that cpptraj uses python3.8 from anaconda (official/default) channel but pytraj uses the one from conda-forge channel. The conda-forge is more popular and more compatible nowadays: https://github.com/Amber-MD/pytraj/blob/f3cbb051f71e15d3f2b4b0d3422d4a4bfcab68e0/environment.yml#L1-L11

The change in #1047 will make sure cpptraj always use the same file as pytraj.

drroe commented 1 year ago

Thanks Hai, I'll give it a try

hainm commented 1 year ago

finally ...

drroe commented 1 year ago

It WORKED, thanks so much @hainm!

This is a really good illustration of my frustration with python. A random segfault on CI that can't be reproduced locally (even with ostensibly the same setup) that goes away simply by changing environments. It seems to happen quite often with python and it makes python development very frustrating.

Which is why I'm very glad to have someone like @hainm along who can help untangle these python knots. Cheers!

hainm commented 1 year ago

yeah, I am here to hold hand. :D