AnacletoLAB / ensmallen

🍇 Ensmallen is the Rust/Python high-performance graph processing submodule of the GRAPE library.
MIT License

conda skeleton FileNotFoundError #97

Open saadljazouli opened 3 years ago

saadljazouli commented 3 years ago

Hi, I am trying to build a conda package, but when I run conda skeleton pypi ensmallen-graph, I get the following error:

with open(os.path.join(src_dir, "setup.py")) as setup:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpbdsakwvxconda_skeleton_ensmallen_graph-0.6.0.tar.gz/ensmallen_graph-0.6.0/setup.py'

I'm not sure of a better way to generate the conda recipe in order to build the package.

Any help would be greatly appreciated.

Thank you!
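
The traceback above shows conda skeleton opening setup.py at the top level of the unpacked source archive. A minimal sketch of that check follows, so an archive can be inspected before running the tool. The helper names and the in-memory archives are hypothetical, and the assumption that a maturin-built source archive carries Cargo.toml and pyproject.toml but no setup.py is not stated in this thread.

```python
import io
import tarfile


def sdist_has_setup_py(fileobj):
    """Return True if a .tar.gz archive contains <top-level-dir>/setup.py."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for name in archive.getnames():
            parts = name.split("/")
            # A source archive unpacks to a single top-level folder,
            # so setup.py is expected exactly one level down.
            if len(parts) == 2 and parts[1] == "setup.py":
                return True
    return False


def make_archive(member_names):
    """Build a small in-memory .tar.gz of empty files (demonstration only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in member_names:
            archive.addfile(tarfile.TarInfo(name))
    buffer.seek(0)
    return buffer


# Assumed layout of a maturin-style archive: no setup.py at the top level.
maturin_like = make_archive([
    "ensmallen_graph-0.6.0/Cargo.toml",
    "ensmallen_graph-0.6.0/pyproject.toml",
])
# Assumed layout of a classic setuptools archive.
setuptools_like = make_archive(["example-1.0/setup.py"])

print(sdist_has_setup_py(maturin_like))
print(sdist_has_setup_py(setuptools_like))
```

For a real archive, pass a file opened in binary mode instead of the in-memory buffers.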

saadljazouli commented 3 years ago

Hi,

Thank you very much for your response. That would be great. I am also following the same guide to package ensmallen-graph, but so far unfortunately I get the error I mentioned when I run conda skeleton.

Thank you for your help!

On Thu, Jul 22, 2021 at 2:42 AM Tommaso Fontana @.***> wrote:

Hi! I'm looking into it, I'm trying to figure out how to add a dummy setup.py file in the package. My goal would be to make it possible to package ensmallen using this guide https://conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-pkgs-skeleton.html which you probably know already.

Is that ok?


iimpulse commented 3 years ago

@zommiommy you guys need any help on this one?

saadljazouli commented 3 years ago

@zommiommy @iimpulse I was wondering if you made any changes to this. When I re-run conda skeleton pypi ensmallen-graph, I now get a different error:

Error: No source urls found for ensmallen-graph

Thank you for your help!

zommiommy commented 3 years ago

Sorry for the late reply, this is pretty new to me so it took me a bit to get familiar with the task.

@iimpulse Yeah thank you! Currently what I'm trying to do is to "manually add a setup.py" inside the package so that the script might work.

@saadljazouli We didn't change ensmallen, but I removed the source package from PyPI because it made my pip crash. My understanding is that the source package is meant for installing from source, but we require maturin and the Rust environment, so it's not compatible.
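
Since the "No source urls found" error appeared after the source package was removed, it can help to see which files a release actually offers. The sketch below splits the `urls` list of a PyPI JSON response (the document served at https://pypi.org/pypi/<project>/json) by its `packagetype` field. The function name and the sample data are hypothetical, and the link between a missing sdist and this exact error is inferred from the thread rather than confirmed.

```python
def split_release_files(release_json):
    """Split the 'urls' list of a PyPI JSON response into sdists and wheels."""
    sdists, wheels = [], []
    for entry in release_json.get("urls", []):
        if entry.get("packagetype") == "sdist":
            sdists.append(entry["filename"])
        elif entry.get("packagetype") == "bdist_wheel":
            wheels.append(entry["filename"])
    return sdists, wheels


# Hand-written sample in the shape of the JSON response (not real data).
sample = {
    "urls": [
        {
            "packagetype": "bdist_wheel",
            "filename": "ensmallen_graph-0.6.0-cp39-cp39-manylinux2010_x86_64.whl",
        },
        {
            "packagetype": "bdist_wheel",
            "filename": "ensmallen_graph-0.6.0-cp39-none-win_amd64.whl",
        },
    ]
}

sdists, wheels = split_release_files(sample)
if not sdists:
    print("wheel-only release: nothing for a source-based recipe to download")
print(len(wheels), "wheels listed")
```

To run it against the live index, load the JSON with `requests.get(...).json()` and pass the result in.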

I think that our best bets are the following:

Building ensmallen so that it's well supported is not trivial (sadly), therefore I'll make a little summary of the build process so that you can better understand how all of this works.

We currently build ensmallen for the x86_64 arch for Linux, Windows, and Mac for python versions: 3.6, 3.7, 3.8, 3.9 for a total of 12 different wheels.
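
The count of 12 wheels is just the product of platforms and Python versions. A small sketch that enumerates the matrix (the platform labels and function name are illustrative, not names used by the build scripts):

```python
from itertools import product

# Illustrative labels; the real build scripts may name these differently.
PLATFORMS = ["linux", "windows", "macos"]
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9"]


def build_matrix(platforms=PLATFORMS, python_versions=PYTHON_VERSIONS, arch="x86_64"):
    """Return one (platform, arch, python_version) tuple per wheel to build."""
    return [(platform, arch, version)
            for platform, version in product(platforms, python_versions)]


matrix = build_matrix()
for target in matrix:
    print(target)
print(len(matrix))  # 12
```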

To be as cross-compatible as possible we follow the manylinux2010 standard, which basically defines the minimum versions of libraries and compilers you must support. Specifically, we currently support any libc version from 2.12 onward (2.12 included). To remove this constraint we also tried to build the library statically using musl, but we weren't able to make it work (I also think this might cause problems due to ensmallen and Python having different standard libraries).
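
To check whether a given machine meets the libc 2.12 floor mentioned above, something like the following sketch could be used. It relies on `platform.libc_ver()` from the standard library; the helper name is hypothetical, and treating any non-glibc result (for example musl, or a non-Linux system) as unsupported is an assumption of this sketch.

```python
import platform
import re

# Minimum glibc version required by the manylinux2010 standard.
MANYLINUX2010_MIN_GLIBC = (2, 12)


def supports_manylinux2010(libc_name, libc_version):
    """Return True if the reported libc is glibc at version 2.12 or newer."""
    if libc_name != "glibc":
        # Assumption: anything that is not glibc is treated as unsupported.
        return False
    match = re.match(r"(\d+)\.(\d+)", libc_version)
    if match is None:
        return False
    # Compare as integer tuples so that 2.5 sorts below 2.12.
    return (int(match.group(1)), int(match.group(2))) >= MANYLINUX2010_MIN_GLIBC


name, version = platform.libc_ver()
print(name, version, supports_manylinux2010(name, version))
```

Comparing integer tuples rather than strings or floats matters here, because "2.5" is older than "2.12" even though it sorts higher as text.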

For Linux our current build pipeline is basically this Makefile, which has the recipe build_manylinux2010 that downloads and sets up the container needed to build ensmallen following the manylinux2010 standard. This is needed to make the library compatible with old libcs (we support any version from 2.12 onward); we used to build for manylinux1 but it's no longer supported by maturin. Then the recipe python_manylinux2010 builds the bindings using the container.

If we need to make an actual build script I think that we could use a self-contained static installation of rust which should be just an archive to extract.

For Mac we just use Luca's laptop which has all the needed python versions installed. For Windows, I have a VM set up for this.

For these last two we don't yet know any best practices, so the build process is basically just:

$ RUSTFLAGS="-C opt-level=3 -C target-cpu=native -C inline-threshold=1000" maturin build --release

For more info see the python bindings README.
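
If the command above had to be driven from a script (for example, to repeat it for each Python version), a thin wrapper could look like this sketch. The function names are hypothetical; the flags and the maturin invocation are copied from the command above.

```python
import os
import shutil
import subprocess

# Same flags as in the shell command above.
RUSTFLAGS = "-C opt-level=3 -C target-cpu=native -C inline-threshold=1000"


def maturin_build_invocation(base_env=None):
    """Return the (command, environment) pair for a release build."""
    env = dict(os.environ if base_env is None else base_env)
    env["RUSTFLAGS"] = RUSTFLAGS
    return ["maturin", "build", "--release"], env


def run_build(cwd="."):
    """Run the build in `cwd`, failing early if maturin is not installed."""
    if shutil.which("maturin") is None:
        raise RuntimeError("maturin is not on PATH")
    command, env = maturin_build_invocation()
    subprocess.run(command, env=env, cwd=cwd, check=True)


# Show what would be executed without actually starting a build.
command, env = maturin_build_invocation({})
print("RUSTFLAGS=" + repr(env["RUSTFLAGS"]), " ".join(command))
```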

Moreover, since ensmallen is fully cross-compilable, we would like IN A DISTANT FUTURE to build for arm processors (aarch64 in particular).

Sorry if my presentation is a bit confusing, and thanks!

zommiommy commented 3 years ago

I was able to build a single version of ensmallen by manually writing the meta.yaml file:

{% set version = "0.6.0" %}

package:
  name: "ensmallen_graph"
  version: "{{ version }}"

source:
  url: "https://files.pythonhosted.org/packages/1b/f2/8e26d2b1d1ea9163918d282ac7f5d86217d95e1325e0a9adac872624faa7/ensmallen_graph-0.6.0-cp36-cp36m-macosx_10_7_x86_64.whl" #[osx and x86_64 and py==36]
  url: "https://files.pythonhosted.org/packages/c0/d9/84fc710d7fcc73d54f98071cc5ec098e2c291e27141872c4d2763368e739/ensmallen_graph-0.6.0-cp36-cp36m-manylinux2010_x86_64.whl" #[linux64 and x86_64 and py==36]
  url: "https://files.pythonhosted.org/packages/3f/74/0a396df76c037b8a77748b27ab2065cd652257e7ce37f93ea7fc4385af66/ensmallen_graph-0.6.0-cp36-none-win_amd64.whl" #[win64 and x86_64 and py==36]
  url: "https://files.pythonhosted.org/packages/84/20/1d3f357b41bd2f77fc355e89bdfeabb9c67c45e3b9bce2f50313406d0a69/ensmallen_graph-0.6.0-cp37-cp37m-manylinux2010_x86_64.whl" #[linux64 and x86_64 and py==37]
  url: "https://files.pythonhosted.org/packages/2d/3a/c8817158ac660ff327adefa6b5f962f3093ea268588b651885f8db6cad74/ensmallen_graph-0.6.0-cp37-none-win_amd64.whl" #[win64 and x86_64 and py==37]
  url: "https://files.pythonhosted.org/packages/75/49/9556833120ff7a2c28690354236959962171926708211c1b38e847303e68/ensmallen_graph-0.6.0-cp38-cp38-manylinux2010_x86_64.whl" #[linux64 and x86_64 and py==38]
  url: "https://files.pythonhosted.org/packages/48/0a/b7660e547aa4537ce688d5b09d03a5aa09e968a86462be1980b2c65cb13c/ensmallen_graph-0.6.0-cp38-none-win_amd64.whl" #[win64 and x86_64 and py==38]
  url: "https://files.pythonhosted.org/packages/ca/1a/e36284a29768caca481e1acc5fc4cdcf7f69c39ba711a69cfd69c32cee7b/ensmallen_graph-0.6.0-cp39-cp39-manylinux2010_x86_64.whl" #[linux64 and x86_64 and py==39]
  url: "https://files.pythonhosted.org/packages/10/cb/cce82740d615765e8db5262d7eef78a478b67a0811071e34c3e2953c2b40/ensmallen_graph-0.6.0-cp39-none-win_amd64.whl" #[win64 and x86_64 and py==39]

build:
  number: 0
  noarch: false

  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp36-cp36m-macosx_10_7_x86_64.whl -vv #[osx and x86_64 and py==36]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp36-cp36m-manylinux2010_x86_64.whl -vv #[linux64 and x86_64 and py==36]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp36-none-win_amd64.whl -vv #[win64 and x86_64 and py==36]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp37-cp37m-manylinux2010_x86_64.whl -vv #[linux64 and x86_64 and py==37]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp37-none-win_amd64.whl -vv #[win64 and x86_64 and py==37]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp38-cp38-manylinux2010_x86_64.whl -vv #[linux64 and x86_64 and py==38]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp38-none-win_amd64.whl -vv #[win64 and x86_64 and py==38]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp39-cp39-manylinux2010_x86_64.whl -vv #[linux64 and x86_64 and py==39]
  script: {{ PYTHON }} -m pip install ensmallen_graph-0.6.0-cp39-none-win_amd64.whl -vv #[win64 and x86_64 and py==39]

requirements:
  host:
    - pip
    - python
  run:
    - python

about:
  home: "https://github.com/AnacletoLAB/ensmallen_graph"
  license: MIT
  license_family: MIT
  summary: "Rust library to run node2vec-like weighted random walks on very big graphs (~50M nodes and ~150M edges). Based on our benchmarks, our walk is ~600 times faster than Python's Networkx."
  doc_url: https://github.com/AnacletoLAB/ensmallen_graph
  dev_url: https://github.com/AnacletoLAB/ensmallen_graph

extra:
  recipe-maintainers:
    - zommiommy
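
The bracketed comments at the end of the url and script lines are selectors: only the lines whose expression is true for the current platform and Python version are kept. The sketch below is a simplified stand-in for that filtering, not conda-build's implementation; the regex, function name, and variable dictionary are all hypothetical.

```python
import re

# Matches "<body> #[<expression>]" at the end of a line.
SELECTOR = re.compile(r"^(?P<body>.*?)\s*#\s*\[(?P<expr>[^\[\]]+)\]\s*$")


def apply_selectors(text, variables):
    """Keep unselected lines, plus selected lines whose expression is true."""
    kept = []
    for line in text.splitlines():
        match = SELECTOR.match(line)
        if match is None:
            kept.append(line)
        # eval is only acceptable here because the recipe text is trusted.
        elif eval(match.group("expr"), {"__builtins__": {}}, dict(variables)):
            kept.append(match.group("body"))
    return "\n".join(kept)


recipe = '''source:
  url: "mac.whl" #[osx and x86_64 and py==36]
  url: "linux.whl" #[linux64 and x86_64 and py==36]
  url: "win.whl" #[win64 and x86_64 and py==36]'''

linux_py36 = {"osx": False, "linux64": True, "win64": False, "x86_64": True, "py": 36}
print(apply_selectors(recipe, linux_py36))
```

Evaluating the same sample with `osx` set to true and `py` set to 37 keeps no url line at all, which mirrors the recipe above: it lists a macOS wheel only for Python 3.6.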

To generate the url and script lines I wrote a little script:

import requests
import bs4

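# Extract all the wheel URLs from the project's download page
r = requests.get("https://pypi.org/project/ensmallen-graph/#files")
r.raise_for_status()
# Name the parser explicitly so bs4 does not guess one and emit a warning
soup = bs4.BeautifulSoup(r.text, "html.parser")
urls = [
    x["href"]
    for x in soup.find_all('a', href=True)
    if ".whl" in x["href"]
]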

# Utility functions
def get_selector(url):
    if "manylinux2010" in url:
        os = "linux64"
    elif "macosx" in url:
        os = "osx"
    elif "win" in url:
        os = "win64"
    else:
        raise ValueError("unknown os for '{}'".format(url))

    if "cp39" in url:
        py = "39"
    elif "cp38" in url:
        py = "38"
    elif "cp37" in url:
        py = "37"
    elif "cp36" in url:
        py = "36"
    else:
        raise ValueError("unknown py for '{}'".format(url))

    return f"#[{os} and x86_64 and py=={py}]"

def get_file_name(url):
    return url.rpartition("/")[2]

# Generate the lines, indented to sit under "source:" and "build:" in the recipe
for url in urls:
    print(f"  url: \"{url}\" {get_selector(url)}")
print("\n\n\n")
for url in urls:
    print(f"  script: {{{{ PYTHON }}}} -m pip install {get_file_name(url)} -vv {get_selector(url)}")
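
The script above picks the selector by searching the URL for substrings. An alternative sketch parses the wheel filename into its dash-separated fields (name, version, optional build tag, Python tag, ABI tag, platform tag) and derives the selector from the tags. The function names are hypothetical, and x86_64 is hard-coded as in the original script.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: '{}'".format(filename))
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: '{}'".format(filename))
    return {
        "name": name, "version": version, "build": build_tag,
        "python": python_tag, "abi": abi_tag, "platform": platform_tag,
    }


def selector_from_wheel(filename):
    """Build the recipe selector comment for one wheel filename."""
    tags = parse_wheel_filename(filename)
    if tags["platform"].startswith("manylinux"):
        os_name = "linux64"
    elif tags["platform"].startswith("macosx"):
        os_name = "osx"
    elif tags["platform"].startswith("win"):
        os_name = "win64"
    else:
        raise ValueError("unknown os for '{}'".format(filename))
    if not tags["python"].startswith("cp"):
        raise ValueError("unknown py for '{}'".format(filename))
    # "cp36" -> "36"; the architecture is hard-coded as in the script above.
    return f"#[{os_name} and x86_64 and py=={tags['python'][2:]}]"


print(selector_from_wheel("ensmallen_graph-0.6.0-cp36-cp36m-macosx_10_7_x86_64.whl"))
print(selector_from_wheel("ensmallen_graph-0.6.0-cp39-none-win_amd64.whl"))
```

Reading the tags by position avoids adding one branch per Python version, so a new version would not need a code change.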

Now I'll try to understand how to build for all the targets and fix the dependency problems.

Currently we need these libraries:

compress_json
tqdm
pandas
downloaders

But compress_json and downloaders are not on conda, so now I'll figure out what to do.

zommiommy commented 3 years ago

Adding a file called conda_build_config.yaml in the same folder with:

python:
  - 3.6
  - 3.7
  - 3.8
  - 3.9

allows us to build for the different Python versions, so once we figure out what to do about compress_json and downloaders we might have all we need!

zommiommy commented 3 years ago

OK, those are Luca's packages; we are going to publish them on conda so that we can create the packages.

LucaCappelletti94 commented 2 years ago

Hello @saadljazouli, should we revisit this for the new (and significantly updated) version of Ensmallen?

justaddcoffee commented 2 years ago

@saadljazouli - what would be required from @LucaCappelletti94 and @zommiommy for us to update the Ensmallen version in N3C? I think just an updated conda package?

Our code in N3C is currently pinned to Python 3.6 because the Ensmallen version we are using requires it. I think this will soon become a problem as Python 3.6 becomes obsolete.

iimpulse commented 2 years ago

N3C just pulls the most recent conda version every dayish.

LucaCappelletti94 commented 2 years ago

It is still impossible to just use pip, right?

iimpulse commented 2 years ago

@LucaCappelletti94 Yes. However, getting ensmallen-graph into conda should be very straightforward. The conda forge is the source for anaconda packages. It just uses the PyPI distribution as the source from a conda recipe. See here https://conda-forge.org/#contribute

One thing to note is that I just searched the packages and there is already an ensmallen recipe (different c++ package). So think of a naming scheme for this new recipe.

LucaCappelletti94 commented 2 years ago

Could you check whether "grape" is free or taken? I'm not sure how to check exactly.

iimpulse commented 2 years ago

Grape seems available. It probably makes sense to make the feedstock at the grape level anyway? All of the feedstocks are here https://conda-forge.org/feedstock-outputs/. If you want I can create the feedstock for grape?

zommiommy commented 2 years ago

I'm trying to follow what you suggested: https://github.com/conda-forge/staged-recipes/pull/20682

isuruf commented 8 months ago

grape and ensmallen should be available in conda-forge now.

conda install ensmallen_graph
conda install grape