pydot / pydot-ng

Python interface to Graphviz's Dot language compatible with Python 2 nad Python 3
MIT License
22 stars 10 forks source link

Support Graphviz installed from Anaconda #52

Open prmtl opened 5 years ago

prmtl commented 5 years ago

Graphviz installed using Anaconda is using .bat scripts to correctly setup running it on Windows.

prmtl commented 5 years ago

@leycec Could you look at it? I must admit that I didn't use conda so if you can quickly test it, let me know if this is correct solution. In the meantime I'll try to prepare some CI jobs for using pydot-ng with Conda (any known projects that can I use as an inspiration?).

leycec commented 5 years ago

Humble apologies for the half-month delay, @prmtl. My awesome father-in-law unexpectedly passed away several hours before you submitted this request and, well... cue sad face emoji.

The approach you've taken here should, in theory, work. That said, it's not necessarily ideal. That said, the perfect is the enemy of the good. That said, I'm a perfectionist.

Building Utopia Together

In a perfect world, you wouldn't bother iteratively testing filetypes. Why? Because you don't need to. Also, because you absolutely shouldn't.

POSIX-compatible platforms signify executability via the executable bit (e.g., chmod +x /usr/bin/dot) rather than filetype. Moreover, .exe and .bat are Windows-specific filetypes. The current pydot-ng approach could, in obscure edge cases, erroneously attempt to run Windows-specific Graphviz executables on POSIX platforms (e.g., due to inadvisable use of a $WINEPREFIX).

Likewise, Windows users preferring pip will always want .exe-suffixed Graphviz executables to be run. .bat-suffixed Graphviz executables should only be run for Anaconda-based Windows Python interpreters; everywhere else, it's .exe or bust.

Therefore, the filetype you want is a deterministic function of the current platform + Python environment. Specifically, if the current platform is:

No iteration required. A single if conditional suffices. KISS, right?

Code or It Didn't Happen

Very well. You should now be quietly thinking to yourself: "Yes, but everything you've said assumes we can efficiently and reliably test whether the current Python environment is Anaconda-based or not. Can we?"

Fear not: we can. Sadly, I'd forgotten how. Happily, I remembered that I already solved this half a year ago (it feels like a lifetime) in a farcical comment chain at ContinuumIO/anaconda-issues#1666. It turns out that a single filesystem access suffices to decide this question:

# "True" only if this is an Anaconda-based Windows Python interpreter. See
# https://stackoverflow.com/a/47610844/2809027 for details.
is_conda = os.path.exists(os.path.join(sys.prefix, 'conda-meta'))

Given that boolean, your for prg in progs: iteration would then instead resemble:

# "True" only if this is an Anaconda-based Windows Python interpreter. See
# https://stackoverflow.com/a/47610844/2809027 for details.
is_conda = os.path.exists(os.path.join(sys.prefix, 'conda-meta'))

for prg in progs:
    if progs[prg]:
        continue

    # If this is a Graphviz executable under Windows with no filetype...
    if os.name == 'nt' and not os.path.splitext(prg)[1]:
        # If this is an Anaconda-based Python interpreter, default to the batch
        # file wrappers specific to Anaconda Graphviz packages; else, default
        # to standard Windows executables.
        prg += '.bat' if is_conda else '.exe'

    prg_path = os.path.join(path, prg)

Boom. When the comments are twice as long as the actual code, @leycec was here.

Anaconda for Great Justice

In the meantime I'll try to prepare some CI jobs for using pydot-ng with Conda

any known projects that can I use as an inspiration?

Why, yes, actually: ours.

I note that pydot-ng already leverages AppVeyor for Windows-based CI (awesome), but leveraging pip rather than conda (less awesome). Python users on Windows typically leverage conda rather than pip, for the simple reason that many packages still fail to distribute universal binary wheels suitable for use with pip under Windows. Even when such wheels are distributed, they tend to be substantially less performant than their equivalent conda packages.

The classic example is Python's scientific stack: NumPy, SciPy, Matplotlib, Pandas, TensorFlow, and so on. Whereas Anaconda's NumPy package is linked against Intel's highly optimized Math Kernel Library (MKL) on all supported platforms (including Windows), NumPy's official binary wheel is only linked against ATLAS' less performant BLAS and LAPACK implementations constrained to SSE2 instructions.

Anaconda trivially circumvents these issues. Transitioning your AppVeyor configuration from pip to conda would thus ensure that you test pydot-ng from a similar environment as that of your Windows userbase. This is a good thing. Resolving horrifying Windows + Anaconda + GraphViz inconsistencies is only the icing on the mouldy cake.

Should you choose to accept this rewarding mission, the general approach is as follows:

requirements-conda.txt

Add a new top-level requirements-conda.txt file to this repository, perhaps resembling our well-commented file of the same name. By inspection of your setup.py and test-requirements.txt configuration, an entirely untested (but hopefully useful) first working draft might be:

graphviz >=2.38.0
pyparsing >=2.0.1
pytest >=3.8.2
mock >=2.0.0

Since this is conda rather than pip, note the inclusion of non-Pythonic mandatory dependencies – namely, GraphViz. Chocolatey's cinst package manager is neither required nor desired here. In fact, Chocolatey should be entirely ignored. Let Anaconda do the heavy lifting, for that is its job description.

More importantly, whitespace is significant. Ergo, graphviz >=2.38.0 is a syntactically valid conda package specification but the seemingly similar string graphviz>=2.38.0 is not. Well, nuthin's perfect.

appveyor.yml

So, you already have an Appveyor configuration. That's great! Sadly, you'll need to completely rewrite it from the ground up to support conda rather than cinst + pip. That's perhaps not so great.

On the bright side, "completely rewrite it from from the ground up" just means "liberally copy-and-paste somebody else's." Here's ours, for example... complete with verbose commentary guaranteed to make your head explode like a jelly-filled piñata on the Day of the Dead.

Our test suite even uses py.test like yours. Heaven is indeed a place on GitHub.

Quotes... from Hell

Let's get pedantic. You really don't want to reinvent the wheel by attempting to manually implement shell escaping, quoting, and unquoting – especially under Windows, where the syntactic ruleset is infamously inhumane.

You'll never get it quite right in a genuinely portable manner. But you don't need to. Whenever you want to munge a string for consumption by a shell interpreter, defer to the existing subprocess and shlex modules in the Python stdlib instead. Our multiphysics biology simulator, for example, defines the following cross-platform function for doing so:

def shell_quote(text):
    '''
    Shell-quote the passed string in a platform-specific manner.

    If the current platform is:

    * *Not* Windows (e.g., Linux, OS X), the returned string is guaranteed to be
      suitable for passing as an arbitrary positional argument to external
      commands.
    * Windows, the returned string is suitable for passing *only* to external
      commands parsing arguments in the same manner as the Microsoft C runtime.
      While *all* applications on POSIX-compliant systems are required to parse
      arguments in the same manner (i.e., according to Bourne shell lexing), no
      such standard applies to Windows applications. Shell quoting is therefore
      fragile under Windows -- like pretty much everything.

    Parameters
    ----------
    text : str
        Arbitrary string to be shell-quoted.

    Returns
    ----------
    str
        The passed string shell-quoted.
    '''

    # If the current OS is Windows, do *NOT* perform POSIX-compatible quoting.
    # Windows is POSIX-incompatible and hence does *NOT* parse command-line
    # arguments according to POSIX standards. In particular, Windows does *NOT*
    # treat single-quoted arguments as single arguments but rather as multiple
    # shell words delimited by the raw literal `'`. This is circumventable by
    # calling an officially undocumented Windows-specific function. (Awesome.)
    if os.name == 'nt':
        import subprocess
        return subprocess.list2cmdline([text])
    # Else, perform POSIX-compatible quoting.
    else:
        import shlex
        return shlex.quote(text)

Please steal this. Also, you mispelled "qouted;" it's "quoted." shoot me now

All's Well that Ends

Thus ends another doctoral thesis masquerading as GitHub commentary. In reflection, I regret nothing.

prmtl commented 5 years ago

Thanks for the response! I really appreciate it. And I'm sorry for your loss.

I just quickly went through it and I found a LOT of new info about Windows & Anaconda. I'll slowly go through them later, read more about the topic and reimplement PR.

leycec commented 5 years ago

You're most welcome. My wife and I gratefully appreciate the outpouring of support. You're awesome!

Since (A) you're awesome, (B) Anaconda now defaults to Python 3.7, (C) pydot is moribund dead and therefore unlikely to ever support Python 3.7, (D) BETSE requires a pydot analogue for gene regulatory network (GRN) support and therefore currently also does not support Python 3.7, and (E) I am begrudgingly willing to work long feverish weekends to resolve this blocker, I'd like to begin contributing to pydot-ng.

Fate has tipped my gnarled arthritic hand.

Big Plans for Big Men

I accidentally became an idiot savant at all things Anaconda several months ago. Notably, I'm a co-maintainer on the conda-forge pydot feedstock and the sole creator and maintainer of two other feedstocks. Given this obscure specialization in the conda ecosystem, here's my ad-hoc outline for resolving our mutual issues:

  1. @leycec creates a new pydot-ng feedstock. In conda-forge parlance:
    • A "feedstock" is a public GitHub repository with URL https://github.com/conda-forge/${project_name}-feedstock containing the recipe required to build and test the Anaconda package for that project. You will, of course, be granted full administrator access to this repository. You namedropped another pydot-ng developer. They will, of course, also be granted push access. The more unpaid volunteer contributors the better for all, eh?
    • A "recipe" is a YAML-formatted file specifying exactly how to build and test a specific version of a specific open-source project for all supported platforms. It's mildly analogous to an appveyor.yml file – except rather than merely testing a project on a single platform, a conda-forge recipe both builds and tests that project on all possible platforms. Impressive conda-forge automation then automatically generates continuous integration (CI) specifications for all supported platforms and, assuming tests magically pass, publishes an Anaconda package for that version of that project under the conda-forge channel. Pheeeeew.
  2. Either @leycec or @prmtl whichever comes first submits a pull request with this repository resolving Anaconda integration. I'd be happy to do so after getting the conda-forge feedstock for pydot-ng up and running. Alternately, you're welcome to tackle this sooner if you have the copious free time and gritty desire. I recommend playing video games instead.
  3. @prmtl publishes a new stable pydot-ng release with PyPI, hopefully including that Anaconda fix. What are we up to now – say, version 2.0.1?
  4. Either @leycec or @prmtl whichever comes first formally notifies pydot contributors of the reactivation of pydot-ng and likely death of pydot. Nobody else appears to have noticed that pydot is on its last ventilator. I note three (!) new pydot pull requests by @hugovk, which deeply saddens me. @hugovk is an official member of Pillow, a high-profile Python imaging framework. It'd be fabulous to help him and/or her up onto the pydot-ng hype train instead – even if only temporarily. One way we might go about this is by submitting a new pydot issue politely declaring the probable death of pydot and suggesting viable well-maintained alternatives. (Well, pretty much just us.) An official community announcement of some sort could lend pydot-ng that much-needed momentum.

That's just how we roll.

But... Why conda-forge?

It's a fair question. There already exists an official Anaconda package for pydot-ng, after all. Sadly, official Anaconda packages are largely useless for smaller-scale projects. Why? Because:

My paternal advice is to stridently ignore the existing Anaconda pydot-ng package. Creating conda-forge feedstocks is dramatically preferable. Since I'm on congenial terms with conda-forge staff, I'd be as giddy as a delinquent monkey pilfering its first plantation banana to facilitate this process for you.

tl;dr

I create conda-forge pydot-ng feedstock. You or I or somebody create new pydot-ng pull request fixing Anaconda integration. You publish new pydot-ng release with PyPI including this fix. You or I or somebody create pydot issue (or something) formally notifying the community of "the switch-over."

Good plan? Bad plan? A little bit of both? Cogitate and let me know! Until then, thanks for all the volunteerism and have a raucous post-Halloween weekend. :jack_o_lantern:

hugovk commented 5 years ago

Don't worry too much about my PRs, I'm not using pydot myself, rather helping out with Python upgrades for some of the most-downloaded PyPI packages.

Pydot has half a million a month:

https://hugovk.github.io/top-pypi-packages/

leycec commented 5 years ago

Thanks for the sacrificial volunteerism, @hugovk! I hate to be the lone pallbearer of sad news, but...

erocarrera/pydot is unofficially dead. The sole maintainer, Ioannis Filippidis (@johnyf), went "dark" shortly after completing his doctoral thesis. Dr. Filippidis ceased all online activity at the same exact time. Prior to April 2018, he was heavily active on GitHub, StackOverflow, and Twitter; as of April 2018, he abruptly disappeared from all three sites. According to LinkedIn, he's currently a postdoctoral researcher with Inria. This is a mild relief. He doesn't appear to be physically dead, which is good – only completely inactive online, which is bad. (See erocarrera/pydot#183 for prior discussion.)

In the meanwhile, no one else has push access to erocarrera/pydot. I wouldn't bother posting any further issues or pull requests to that repository. All contributions are being redirected towards /dev/null.

Let us shed a collective tear for the Python graphing library that might have been. :cry:

hugovk commented 5 years ago

Ah, so this repo is a fork? It wasn't clear to me.

@erocarrera: do you have push access to https://github.com/erocarrera/pydot?

At least Ero is a maintainer at https://pypi.org/project/pydot/

And looks like there's a handful of forks! https://pypi.org/search/?q=pydot

"pydot" is still the most popular: https://hugovk.github.io/top-pypi-packages/

erocarrera commented 5 years ago

I do have push access to https://github.com/erocarrera/pydot anybody wants to jump in? let me know if I need to point pydot in pypi to a working fork or if anybody wants to step in as a maintainer.

leycec commented 5 years ago

OMG. The Master awakens from his slumber. We've been desperately trying to contact you and @johnyf for the past several months. So, this is a wondrously unexpected development. Thanks for responding so promptly... and invitingly.

What do you think, @prmtl? Are you up for maintaining the official pydot repository? This is also the perfect opportunity for us to debate transferring ownership from erocarrera/pydot to pydot/pydot. @prmtl already manages GitHub's pydot organization, which currently only contains this fork. If everyone's amenable, here's how this transfer might play out:

  1. @prmtl adds @erocarrera to the pydot organization.
  2. @prmtl enables the Allow members to create repositories for this organization option for the pydot organization. ...at least, temporarily
  3. @prmtl grants @erocarrera repository creation permissions.
  4. @erocarrera transfers ownership from himself to the governing pydot organization, which he would then be a member of.
  5. @erocarrera grants @prmtl PyPI permissions to push new stable releases. Maybe? Does PyPI support that sort of thing? As in most aspects of life, I'm clueless on this part.

It is good, yes? @erocarrera would retain most (...all?) permissions, @prmtl would receive new administrator permissions, and all existing issues, pull requests, and wiki pages would be preserved as is.

If Sebastian Kalinowski (@prmtl) is willing and able, he has my vote for new pydot maintainer; else, I reluctantly nominate myself, Cecil Curry (@leycec). Since all existing stable releases of pydot are incompatible with Python ≥ 3.7, it's vital that somebody step up to the plate, begin merging pending pull requests, and produce a new stable release.

Go, Team PyDot! Go!

erocarrera commented 5 years ago

Good plan! Let me know when I can proceed with the transfer of ownership.

prmtl commented 5 years ago

Sounds like a plan! I really prefer to maintain pydot over pydot-ng since the former can be called as "industry standard" and I remember the mess we have few years ago with bunch of forks (try to convince libs maintainers to change their requirements again :)).

I've invited @erocarrera and @leycec to the organisation and enabled permission that allow members to create repositories.

erocarrera commented 5 years ago

Thanks! I've transferred pydot to @prmtl. GitHub warned me that a repository with that name already existed for prmtl, so I renamed it to pydot-classic. Hope that's fine.

prmtl commented 5 years ago

@erocarrera I think that you should transfer it to pydot organisation not me personally. I've also renamed the repo on my account that gave you this warning, so if transferring to organisation will not work, you can try again with transfer to my account.

erocarrera commented 5 years ago

It should be done, I've transferred it to the pydot organization this time.

prmtl commented 5 years ago

Awesome!

prmtl commented 5 years ago

Update: I've merged fix for Python 3.7 and dropped support of Python 2.6. Now I'm ready to make a release. I've just need permissions to pydot's PyPI.