Pebaz / nimporter

Compile Nim Extensions for Python On Import!
MIT License
824 stars 33 forks source link

How to bypass compiling tests? #23

Closed Benjamin-Lee closed 4 years ago

Benjamin-Lee commented 4 years ago

For background, since I've filed a few issues, I'm working on a library to analyze DNA and using this library to study the coronavirus genome. The pure Python version was quite slow, so I rewrote it in Nim and have achieved great (1000x+) speedups. I followed your advice in #20 and am just about ready to publish the librtd library and CLI.

Currently, I'm getting an error when testing my CLI (which is done through the Python interface) on Windows. What is weird is that the error is coming from my tests:

LINK : error LNK2001: unresolved external symbol PyInit_tests
220
    build\temp.win-amd64-3.8\Release\nim-extensions\tests.tests\tests.cp38-win_amd64.lib : fatal error LNK1120: 1 unresolved externals
221
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.26.28801\\bin\\HostX86\\x64\\link.exe' failed with exit status 1120

Since the tests don't need to be compiled, is there anyway to turn that off? Alternatively, if you have suggestions for fixing the error, I would really appreciate it. Thanks again for such a great library!

Pebaz commented 4 years ago

Great project! I gave it a star ⭐️

This is an interesting one.

The PyInit_tests refers to a Python extension being loaded only to find that the init module function is not defined.

When using Nimpy, this is automatically handled for you when you import nimpy and have at least one function decorated with {.exportpy.}.

However, in your code, I do not see Nimpy being imported at all.

Can you tell me more about how you are using your Nim library in Python?

Nim libraries cannot be directly used. They have to have a Nim file within your Python project that imports the Nim library. Within that Nim file, you then import Nimpy and decorate any wrapper functions with {.exportpy.}.

Alternatively, the way I would go is within your Python project, create a folder containing a .nim and .nimble of the same name as the folder. Within the .nimble, you can add your Nim library as a dependency.

As far as I am able to tell from the information above, Python/Nimporter is trying to import your pure-Nim code which is not (as far as I can see in your project) decorated with the {.exportpy.} from Nimpy.

Let me know how I can further assist!

Benjamin-Lee commented 4 years ago

Sorry for the confusion and thanks for the help (and the star)! I was working on a branch to develop the Python interop, which I have since merged into master. If you take a look now, you should see a directory structure like this (some files omitted):

.
├── librtd
│   ├── __init__.py
│   ├── cli.nim
│   ├── cli_wrapper.py
│   └── librtd.nim
├── librtd.nimble
├── setup.py
└── tests
    ├── config.nims
    └── tests.nim

In librtd.nimble, I do list nimpy > 0.1.0 as a dependency. This would probably explain why it's working on my laptop as well as on Ubuntu. Interestingly, none of the builds are failing except for the Windows, as seen here.

With respect to your suggestions on directory structure, there is a very good chance that mine is wrong, since I'm getting this warning:

 Warning: Package 'librtd' has an incorrect structure. The top level of the package source directory should contain at most one module, named 'librtd.nim', but a file named 'cli.nim' was found. This will be an error in the future.
16
      Hint: If this is the primary source file in the package, rename it to 'librtd.nim'. If it's a source file required by the main module, or if it is one of several modules exposed by 'librtd', then move it into a 'librtd/' subdirectory. If it's a test file or otherwise not required to build the the package 'librtd.nim', prevent its installation by adding `skipFiles = @["cli.nim"]` to the .nimble file. See https://github.com/nim-lang/nimble#libraries for more info.

Do you I understand you correctly that you suggest moving the CLI and the Python imports to a different folder from the main librtd.nim file? Also, do you think that this might be the reason that the tests are generating the error on Windows?


It may help to explain my desired outcome. As I mentioned before, when analyzing coronavirus genomes, my collaborators and I found that the pure Python implementation was much too slow. I decided to give Nim a try and was very pleasantly surprised with how easy to write and fast Nim is. This is my first "production" ready Nim project.

Unfortunately, the biology world is still highly dependent on Python. Therefore, my goal is as follows:

  1. Be able to call the functions defined in Nim and use the CLI from a pip-installable Python package.
  2. Share the project on the Nim Package Directory.

Goal 1 is more important in the short term so that I can more easily collaborate on the coronavirus research although I do want to build up the Nim bioinformatics ecosystem. Thanks again for all the help!

Pebaz commented 4 years ago

Thank you for taking the time to explain!

With this new information, my recommendation is as follows:

Since your goal is to have both a Nim-compatible and a Python-compatible library, you must place these in separate repositories.

The Nim library should behave exactly the same with no modifications whatsoever.

The Python library should have a folder (that acts as a single module to the eyes of Python) that should have the directory structure:

librtd-py/
    librtd/  # Even though the name is the same as the Nim lib this will not affect anything
        librtd.nimble  # requires "librtd"  <- This is key. You can depend on your Nim lib and also have it be accessible by the Nim community
        librtd.nim  # Import nimpy, {.exportpy.}, wrapper functions around Nim lib
    librtd.py  # import nimporter, import librtd
    setup.py  # ext_modules=nimporter.build_nim_extensions(),

Some more background on this:

In Nimporter, you cannot have a *.nimble file in a directory that is not an extension library. If a *.nimble file is found in any folder, that folder is treated like a library. Similarly, any Nim file that is by itself (alongside Python code) that is not contained within a folder of the same name, that is not adjacent to a Nimble file of the same name, is treated exactly as a Python module (except for the fact that it is written in Nim of course).

I hope this helps and if I can clarify anything let me know!

Benjamin-Lee commented 4 years ago

Do I understand you correctly that there is no way to keep all of the code in a single repository? Or is it possible to have the code in a single repository, just in different directories? Assuming that you meant different directories within the same git repo, I tried using your directory structure:

├── LICENSE
├── README.md
├── librtd
    ├── librtd.nimble
│   └── librtd.nim
├── librtd-py
│   ├── cli.nim # import librtd (shouldn't care whether its a wrapper function with the same API)
│   ├── cli_wrapper.py # invokes cli.nim
│   ├── librtd
│   │   ├── librtd.nim # import librtd, import nimpy, {.exportpy.} wrapper functions around the Nim library
│   │   └── librtd.nimble # requires "nimpy", requires "librtd"
│   ├── librtd.py # import nimporter, from librtd import [wrapper functions]
│   └── setup.py # ext_modules as above
├── test.fasta
└── tests
    ├── config.nims
    └── tests.nim

In doing so, I got an error when running nimporter compile from the librtd-py directory:

~/D/P/R/l/librtd-py ❯❯❯ nimporter compile
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/__pycache__/cli.so
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/__pycache__/cli.nim.hash
Building Extension Mod: cli.nim
Building Extension Lib: librtd
Traceback (most recent call last):
  File "/Users/BenjaminLee/.virtualenvs/librtd/bin/nimporter", line 8, in <module>
    sys.exit(main())
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter_cli.py", line 203, in main
    library=is_lib
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter.py", line 564, in compile_nim_code
    raise NimInvokeException(tmp_cwd, nim_args, errors, output)
nimporter.NimInvokeException: Failed to run command: nimble

Current Directory:
    /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtd

Error Message:
     Error: Execution failed with exit code 256
    /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtd/librtd.nim(1, 8) Error: A module cannot import itself

Command Line Arguments:
    nimble
        c
        --opt:speed
        --parallelBuild:0
        --gc:markAndSweep
        --threads:on
        --app:lib
        -d:release
        -d:ssl
        --out:/Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtd/__pycache__/librtd.so
        /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtd/librtd.nim
        --accept

This error looks simple enough to fix: just don't use the same name for the Python wrapper as the library itself. I renamed the wrapper Nim "library" for Python from "librtd" to "librtdpy":

.
├── cli.nim
├── cli_wrapper.py
├── librtd.py
├── librtdpy
│   ├── librtdpy.nim
│   └── librtdpy.nimble
└── setup.py

Everything seemed to work fine from here:

~/D/P/R/librtd ❯❯❯ vfreset librtd
Removing /Users/BenjaminLee/.virtualenvs/librtd
Using base prefix '/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7'
New python executable in /Users/BenjaminLee/.virtualenvs/librtd/bin/python3.7
Also creating executable in /Users/BenjaminLee/.virtualenvs/librtd/bin/python
Installing setuptools, pip, wheel...
done.
~/D/P/R/librtd ❯❯❯ cd librtd-py/
~/D/P/R/l/librtd-py ❯❯❯ pip install nimporter
Collecting nimporter
  Using cached nimporter-1.0.1.post1-py3-none-any.whl (21 kB)
Installing collected packages: nimporter
Successfully installed nimporter-1.0.1
~/D/P/R/l/librtd-py ❯❯❯ nimporter compile
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/MANIFEST.in
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/__pycache__/cli.so
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/__pycache__/cli.nim.hash
Deleted Folder:     /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/nim-extensions
Deleted Folder:     /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/build/temp.macosx-10.15-x86_64-3.7/nim-extensions
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtdpy/__pycache__/librtdpy.so
Deleted:            /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/librtdpy/__pycache__/librtdpy.nim.hash
Building Extension Mod: cli.nim
Building Extension Lib: librtdpy
/Users/BenjaminLee/.nimble/pkgs/nimpy-0.1.0/nimpy.nim(972, 32) Warning: Cannot prove that 'arg1k' is initialized. This will become a compile time error in the future. [ProveInit]
Done.
Built 2 Extensions In 1.971 secs
~/D/P/R/l/librtd-py ❯❯❯ pip install -e .
Obtaining file:///Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py
Processing /Users/BenjaminLee/Library/Caches/pip/wheels/9b/04/dd/7daf4150b6d9b12949298737de9431a324d4b797ffd63f526e/docopt-0.6.2-py2.py3-none-any.whl
Installing collected packages: docopt, librtd
  Running setup.py develop for librtd
Successfully installed docopt-0.6.2 librtd
~/D/P/R/l/librtd-py ❯❯❯ python -c "import librtd; print(librtd.return_time_distribution('ATGCA', 1))"
{'A_mean': 4.0, 'A_std': 0.0}
~/D/P/R/l/librtd-py ❯❯❯ librtd 1 ../test.fasta > /dev/null
Info: Using librtd v0.1 by Benjamin D. Lee. (c) 2020 IQT Labs, LLC.
Warning: Writing data to stdout
[======================================================================================================================================] 100.00%
Success: Analyzed RTD for 1 sequence totaling 7 bp in 0.0 seconds.

This works perfectly except in the main librtd/ directory:

~/D/P/R/l/librtd-py ❯❯❯ librtd 2 ../test.fasta > /dev/null
Info: Using librtd v0.1 by Benjamin D. Lee. (c) 2020 IQT Labs, LLC.
Warning: Writing data to stdout
[======================================================================================================================================] 100.00%
Success: Analyzed RTD for 1 sequence totaling 7 bp in 0.0 seconds.
~/D/P/R/l/librtd-py ❯❯❯ cd ..
~/D/P/R/librtd ❯❯❯ librtd 2 test.fasta > /dev/null
clang: error: no input files
Traceback (most recent call last):
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter.py", line 841, in __validate_spec
    util.module_from_spec(spec)
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: dynamic module does not define module export function (PyInit_librtd)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/BenjaminLee/.virtualenvs/librtd/bin/librtd", line 11, in <module>
    load_entry_point('librtd', 'console_scripts', 'librtd')()
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/pkg_resources/__init__.py", line 490, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2862, in load_entry_point
    return ep.load()
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2462, in load
    return self.resolve()
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2468, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/Users/BenjaminLee/Desktop/Python/Research/librtd/librtd-py/cli_wrapper.py", line 21, in <module>
    from librtd import __version__
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 906, in _find_spec
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter.py", line 1148, in find_spec
    return Nimporter.import_nim_code(fullname, path, library=True)
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter.py", line 819, in import_nim_code
    cls.__validate_spec(spec)
  File "/Users/BenjaminLee/.virtualenvs/librtd/lib/python3.7/site-packages/nimporter.py", line 887, in __validate_spec
    raise NimporterException(error_message) from import_error
nimporter.NimporterException: Error importing /Users/BenjaminLee/Desktop/Python/Research/librtd/librtd/__pycache__/librtd.so
Error Message:

    dynamic module does not define module export function (PyInit_librtd)

Python Version:

    3.7.7 (default, Mar 18 2020, 19:38:56) [Clang 11.0.0 (clang-1100.0.33.17)]

Nim Version:

    Nim Compiler Version 1.2.0 [MacOSX: amd64]

CC Version:

    <Error getting version>

Installed CCs:

    {'clang': PosixPath('/usr/bin/clang'), 'gcc': PosixPath('/usr/bin/gcc')}

Please help improve Nimporter by opening a bug report at: https://github.com/Pebaz/nimporter/issues/new and submit the above information along with your description of the issue.

~/D/P/R/librtd ❯❯❯ cd ..
~/D/P/Research ❯❯❯ librtd 2 librtd/test.fasta
Info: Using librtd v0.1 by Benjamin D. Lee. (c) 2020 IQT Labs, LLC.
Warning: Writing data to stdout
[>......................................................................................................................................] 0.00%{"id": "test", "GC_mean": 3.0, "CA_mean": 4.0, "CA_std": 0.0, "GC_std": 0.0}
[======================================================================================================================================] 100.00%
Success: Analyzed RTD for 1 sequence totaling 7 bp in 0.0 seconds.

All tests are now passing on Ubuntu, Windows, and MacOS. My only remaining question is: do you have any ideas what caused that error? It's not a big deal to me if it's caused by a name collision, just want to make sure there's nothing majorly wrong with what I did.

Thanks again for all of your help – this is a wonderful project!

Pebaz commented 4 years ago

Awesome! Glad it was able to work at least a little bit!

So I was able to determine exactly why this wasn't working.

Your cli_wrapper.py file imports the librtd.py file on line 21.

When run normally, this works fine.

However, when you ran the CLI in the main directory, there is a folder named: librtd that contains the Nim (not Nimporter) library.

When the CLI is run, librtd is requested to be imported but the librtd.py file is no longer there. Since Nimporter checks the CWD first to favor Nim files, it found the pure Nim library folder librtd instead. The net effect was that the librtd.py that got installed with the Python library was not used. What was even more confusing is that the Nim library compiled perfectly and fit the criterion for a Nimporter library (mentioned above) but when it was imported, it crashed since it was not using nimpy which handles defining the PyInit_<module name> function for you.

So there are a couple solutions to this:

Not sure which one is more convenient but at least there are no code issues at this point just convention.

Thank you for your kind words and it's exciting to see the first finished project using Nimporter!

Benjamin-Lee commented 4 years ago

For reference, changing the name of the Python file to __init__.py is yielding other errors since now Python can't find the module to import when using the Python API. Simply not running in the root folder, while annoying, seems like the best workaround.