facelessuser / soupsieve

A modern CSS selector implementation for BeautifulSoup
https://facelessuser.github.io/soupsieve/
MIT License

missing dependency on `bs4` #251

Closed asottile-sentry closed 1 year ago

asottile-sentry commented 2 years ago
$ pip install soupsieve
Collecting soupsieve
  Downloading soupsieve-2.3.2.post1-py3-none-any.whl (37 kB)
Installing collected packages: soupsieve
Successfully installed soupsieve-2.3.2.post1
$ python3 -c 'import soupsieve'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/y/venv/lib/python3.10/site-packages/soupsieve/__init__.py", line 29, in <module>
    from . import css_parser as cp
  File "/tmp/y/venv/lib/python3.10/site-packages/soupsieve/css_parser.py", line 5, in <module>
    from . import css_match as cm
  File "/tmp/y/venv/lib/python3.10/site-packages/soupsieve/css_match.py", line 7, in <module>
    import bs4  # type: ignore[import]

I understand soupsieve is intended to be used with beautifulsoup, but it isn't even importable without it installed. soupsieve should probably have an install_requires (or whatever the hatch equivalent is) that includes beautifulsoup4.

facelessuser commented 2 years ago

Yep, nor will it be. The reason is we didn't want to create a circular dependency. Originally, we weren't sure that it would be accepted as the official CSS selector library of BeautifulSoup. If it had not been accepted, we would have had to require bs4 as a dependency.

Generally, it is expected that you will get the package by installing bs4, but if you don't, you will need to install bs4.
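For anyone who does install soupsieve on its own, a minimal sketch of a pre-import check is shown here. It uses only the standard library; the helper name `missing_modules` is made up for illustration and is not part of soupsieve or bs4.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# soupsieve imports bs4 at import time (see the traceback above),
# so both modules need to be present before `import soupsieve` can succeed.
missing = missing_modules(["bs4", "soupsieve"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    import soupsieve  # safe: bs4 was found as well

    print("soupsieve is importable")
```

If `bs4` shows up as missing, installing `beautifulsoup4` brings it in and also pulls in soupsieve as its dependency.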

asottile-sentry commented 2 years ago

is there a problem with a cycle there? it's already cyclical but the metadata doesn't match that
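To see what "the metadata doesn't match" means in practice, the declared dependencies of an installed distribution can be read with the standard library's `importlib.metadata`. The following is a rough sketch: the helpers `declared_requirements` and `names_dependency` are hypothetical names, and the requirement strings in the usage lines are example inputs, not values read from any real package.

```python
import re
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares, or None if it is not installed."""
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None


def _normalize(name):
    # Normalized comparison: lowercase, with runs of "-", "_", "." collapsed to "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def names_dependency(requirements, package):
    """Check whether any requirement string starts with the given package name."""
    wanted = _normalize(package)
    for req in requirements:
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match and _normalize(match.group(1)) == wanted:
            return True
    return False


# Example inputs only: one side of the cycle is declared, the other is not.
print(names_dependency(["soupsieve>1.2"], "soupsieve"))
print(names_dependency([], "beautifulsoup4"))

# Against a real environment (prints None if soupsieve is not installed):
print(declared_requirements("soupsieve"))
```

The import-time dependency goes both ways, but under the behaviour described in this thread only one direction would appear in the declared requirements.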

facelessuser commented 2 years ago

Most people aren't going to install soupsieve directly.

Now, this shouldn't be a surprise either as we do document this fact here and here.

I have had some projects that have created this circular dependency before, and this is especially problematic for people using tools to pin certain package versions, like when using Poetry or other package management systems. In these cases, people will complain about the circular dependency.

So, this is a lose-lose situation for me. I'll have people complain if I don't create the circular link, and I'll have people complain if I do. So, in the end, I chose the path that doesn't actually break anything. Yes, if you don't have bs4, soupsieve won't work, but if you install bs4, it will work fine. The alternative is that I break things like Poetry with no resolution.

asottile-sentry commented 2 years ago

in my experience poetry handles cycles fine -- do you have a link to a failure there?

facelessuser commented 2 years ago

I would have to dig up failures as it was in a different project. Whether it was specifically poetry or some other pip locking library, I don't recall, but what I do recall were very real complaints. I am not looking to revisit and cause the same problems with this project.

facelessuser commented 1 year ago

As the conversation on this has died out, I will close this with the following comment:

Soup Sieve was originally created to accept BS4 objects/elements and to apply selectors, but the hope was always to replace the official CSS selectors in BS4. Since it has been accepted as the new official CSS selector support in BS4 and is a required dependency of BS4, Soup Sieve can really be viewed as a sub-module of Beautiful Soup, as it is not optional and is not useful by itself.

Soup Sieve does offer some optional ways to apply selectors that are not built into Beautiful Soup such as closest, etc. These require you to explicitly import those methods, but Soup Sieve is still considered a part of Beautiful Soup at this point. If BS4 made CSS selectors optional, then, and only then, would we add the BS4 requirement to this package.

facelessuser commented 1 year ago

Just as an FYI, some of the issues related to circular dependencies were installing packages manually: https://github.com/squidfunk/mkdocs-material/discussions/2591. Additionally, there are tools like Pyodide which have to install packages manually. SoupSieve is supported there as well: https://github.com/pyodide/pyodide/tree/main/packages/soupsieve. They would have to add special exceptions to handle this odd, two-way dependency.

I didn't find the specific example involving other package managers, and I'm not sure where the original conversation was, but I'm still fairly certain it was an issue at some time in at least one of them.