wimleers / fileconveyor

File Conveyor is a daemon written in Python to detect, process and sync files. In particular, it's designed to sync files to CDNs. Amazon S3 and Rackspace Cloud Files, as well as any Origin Pull or (S)FTP Push CDN, are supported. Originally written for my bachelor thesis at Hasselt University in Belgium.
https://wimleers.com/fileconveyor
The Unlicense

Packaging File Conveyor as a Python Egg #81

Closed by benoitbryon 12 years ago

benoitbryon commented 13 years ago

Hi,

First of all, thanks for this software! It looks really great!

As a Python developer, I felt confused when I tried fileconveyor:

So, here is a first pull request with a packaging proposal...

Changes description

With these changes, I could:

pip install -e git+https://github.com/benoitbryon/fileconveyor@packaging-egg#egg=fileconveyor

[buildout]
find-links = https://github.com/benoitbryon/fileconveyor/tarball/packaging-egg#egg=fileconveyor-0.1-dev
parts =
    python
eggs =
    cssutils
    pyinotify
    paramiko
    fileconveyor
[python]
recipe = zc.recipe.egg
interpreter = python
eggs = ${buildout:eggs}

With these changes, you should be able to release your work on PyPI! That is a good start for promoting your project in the Python community. Maybe you will get additional contributors ;)
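For readers unfamiliar with Python packaging, here is a minimal sketch of the kind of `setup.py` metadata that makes the `pip` and buildout installs above possible. It is not the file from the pull request: the description text, the dependency list (taken from the other eggs in the buildout example) and the `run_setup` helper are assumptions made for illustration.

```python
# Illustrative setup.py sketch; NOT the file from the pull request.
# All metadata values below are assumptions drawn from this thread.

SETUP_KWARGS = {
    "name": "fileconveyor",
    # PEP 440 spelling of the "0.1-dev" seen in the buildout find-links URL.
    "version": "0.1.dev0",
    "description": "Daemon to detect, process and sync files, e.g. to CDNs",
    "url": "https://wimleers.com/fileconveyor",
    # Assumed dependencies: the other eggs listed in the buildout example.
    "install_requires": ["cssutils", "pyinotify", "paramiko"],
}


def run_setup():
    """Hand the metadata to setuptools (hypothetical helper).

    A real setup.py would call setup() directly at module level.
    """
    from setuptools import find_packages, setup  # third-party, imported lazily

    setup(packages=find_packages(), **SETUP_KWARGS)
```

With a file along these lines at the repository root, `pip install -e .` from a checkout would install the package in development mode.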

Tests

I successfully used the modified version of fileconveyor on a Linux system, but I did not try every processor or transporter. So I cannot say "it works!" but only "it worked for me".

Do you have a test procedure that I should follow the next time I want to submit a pull request?

wimleers commented 13 years ago

First of all: WOW. THIS IS AWESOME!

(And sorry for shouting — I'm just very excited!)

wimleers commented 13 years ago

Okay, some background info.

I knew what I wanted to write for my bachelor thesis. I knew what it had to be capable of. But I didn't want to write it in a particular language. Some of my classmates were extremely excited about Python back then (and still are, AFAIK) — they claimed there was "a Python module for everything". So I chose Python, despite never having used Python before (hence some hacks and unpythonesque things, the lack of proper Python packaging, etc.).

Only to be disappointed to find out that for all the crucial things I needed to do, there were no Python modules. That's why I wrote FSMonitor. But it was actually extremely hard to find some Python module that provided file system storage abstraction, protocol abstraction, or whatever you call it. Eventually, I stumbled upon django-storages and started using that. I contributed several patches to it, making it more stable.

It's always been my intention to contribute to django-storages from the File Conveyor project (i.e. upstream), but also to get new functionality from the developers of django-storages (downstream). It'd be a win-win situation. But, unfortunately, virtually nobody used File Conveyor for quite some time. It seems it's gaining some traction at last! (See the exciting announcement at http://fileconveyor.org/. More details to come later!)


Your changes.


Expected results: installing with pip and possibly getting additional contributors: HURRAY!


Test procedure: not really. I do have unit tests, but that's it. These unit tests are currently written per Python module (since I wrote this module by module, until each module was suitably tested and pretty much bug-free) and don't yet come with a project-wide test runner that runs all the individual unit tests. That's another thing that needs to be added :)


Conclusion:

  1. You rock! Thank you for your contribution, sir.
  2. You're clearly a Pythonista and work with it far more regularly. Which makes this contribution all the more valuable!
  3. Would you like to get commit rights? :)
  4. I definitely want to commit this, but could you first fix my only objection (support full Python module paths, but also support just module names if they ship with File Conveyor)?
  5. Documentation would need to be updated, but I'll take care of that myself. If you could just fix point 4, that'd be great :)

wimleers commented 13 years ago

An issue has been created for the test runner thingie: #83.

benoitbryon commented 13 years ago

> Would you like to get commit rights?

I suggest that I submit additional pull requests before being granted commit rights, so that we can discuss things and develop a shared vision for the project. I feel that my own vision of the project is too limited right now.

If you think it's better to give me commit rights now, do it. I won't be using them on master at first (I will work on branches)... Maybe later with your approval.

benoitbryon commented 13 years ago

> could you first fix my only objection (support full Python module paths, but also support just module names if they ship with File Conveyor)

Agreed. I guess I can implement it with some "try-except ImportError" block.
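A minimal sketch of what such a "try-except ImportError" fallback could look like. It is not the code from the pull request: `load_class` and its `default_prefix` parameter are hypothetical names, and the standard-library `json` package stands in for the package that ships the bundled modules.

```python
import importlib


def load_class(name, default_prefix):
    """Resolve "module.ClassName" to a class (hypothetical helper).

    The bundled location (default_prefix + "." + module) is tried first;
    if that import fails, `name` is treated as a full dotted path.
    """
    module_name, _, class_name = name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.ClassName', got %r" % name)

    for candidate in (default_prefix + "." + module_name, module_name):
        try:
            module = importlib.import_module(candidate)
        except ImportError:
            continue  # not importable under this name, try the next candidate
        return getattr(module, class_name)

    raise ImportError(
        "could not import %r (also tried under %r)" % (name, default_prefix)
    )


# Short name, found inside the "bundled" package (json.decoder):
bundled = load_class("decoder.JSONDecoder", default_prefix="json")
# Full dotted path, found outside of it:
external = load_class("collections.OrderedDict", default_prefix="json")
```

Note that a bare `except ImportError` like this one cannot tell "the module does not exist" apart from "the module exists but one of its own imports failed".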

benoitbryon commented 13 years ago

The commits above are about your remark: support full Python module paths, but also support just module names if they ship with File Conveyor:

Right now, the error message for external processors or transporters is not perfect. The scenario is:

wimleers commented 13 years ago

I'm leaving on vacation tomorrow morning, so it may be a while until I get back to you on this (I saw that you're French: we're going on vacation to Nice!). I'll do so ASAP, I promise. Considering the preparations I still have to make and the arrangements for my housing in Palo Alto (I'm interning at Facebook), I really can't spend any time reviewing your changes right now.

Thanks for your patience and understanding!

benoitbryon commented 13 years ago

Warning: currently, catching import errors during processor and transporter loading can hide import errors in external libraries.

Synopsis:

It may not be a blocking issue.
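One possible way to avoid hiding such errors, sketched here under the assumption of Python 3.3 or later (where `ImportError` carries a `name` attribute): only swallow the error when the module being loaded is itself the one that is missing, and re-raise everything else. The helper name `import_if_present` is hypothetical and not part of File Conveyor.

```python
import importlib


def import_if_present(module_name):
    """Import `module_name`; return None only if that module itself is missing.

    An ImportError raised *inside* the module (for example a missing
    third-party dependency) is re-raised instead of being hidden.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # exc.name is the module that could not be found (Python 3.3+).
        missing = getattr(exc, "name", None)
        if missing is not None and (
            module_name == missing or module_name.startswith(missing + ".")
        ):
            return None  # the requested module (or its parent package) is absent
        raise  # the module exists, but something it imports is broken
```

A loader built on this would fall back to the next candidate name only when the function returns `None`, so a processor or transporter with a missing dependency still fails with the real error message.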

wimleers commented 13 years ago

I'm afraid I can't get it to work:

--( ~/Work/fc-benoit/fileconveyor (packaging-egg) )-- python arbitrator.py 
Traceback (most recent call last):
  File "arbitrator.py", line 23, in <module>
    from fileconveyor.settings import *
ImportError: No module named fileconveyor.settings

or

--( ~/Work/fc-benoit (packaging-egg) )-- python fileconveyor/arbitrator.py 
Traceback (most recent call last):
  File "fileconveyor/arbitrator.py", line 23, in <module>
    from fileconveyor.settings import *
ImportError: No module named fileconveyor.settings

Basically, it seems like you can't refer to a Python package itself by its name from within the package. I.e. the following code:

from fileconveyor.settings import *
from fileconveyor.config import *
from fileconveyor.persistent_queue import *
from fileconveyor.persistent_list import *
from fileconveyor.fsmonitor import *
from fileconveyor.filter import *
from fileconveyor.processors.processor import *
from fileconveyor.transporters.transporter import *
from fileconveyor.daemon_thread_runner import *

should be changed to

from settings import *
from config import *
from persistent_queue import *
from persistent_list import *
from fsmonitor import *
from filter import *
from processors.processor import *
from transporters.transporter import *
from daemon_thread_runner import *

A similar change is needed at the importing of Transporter classes:

default_prefix = 'fileconveyor.transporters.transporter_'

->

default_prefix = 'transporters.transporter_'

Maybe this is just my system, maybe something is wrong with my Python installation (which I doubt), or maybe I'm just doing something stupid (if I ever was something close to a Python expert, then that's definitely no longer the case).

Please tell me what a fool I'm being and where I'm making a super obvious, moronic mistake…

benoitbryon commented 13 years ago

It looks like the "fileconveyor" package is not on your sys.path when you launch arbitrator.py. How did you install the fileconveyor package?
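The effect can be reproduced with a throwaway package. The sketch below assumes Python 3.7+, and `conveyor_demo` is a made-up package name. Running a script directly puts the script's own directory at the front of `sys.path`, so the package that contains it cannot be imported by name. Running it with `-m` from the parent directory puts that parent directory on `sys.path` instead.

```python
import os
import subprocess
import sys
import tempfile


def run_demo():
    """Run the same script directly and via -m; return both results."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "conveyor_demo")
        os.mkdir(pkg)
        files = {
            "__init__.py": "",
            "settings.py": "VALUE = 'settings loaded'\n",
            "arbitrator.py": (
                "from conveyor_demo.settings import VALUE\n"
                "print(VALUE)\n"
            ),
        }
        for filename, content in files.items():
            with open(os.path.join(pkg, filename), "w") as handle:
                handle.write(content)

        # Direct run: sys.path[0] is .../conveyor_demo, so the package
        # "conveyor_demo" itself is not importable.
        direct = subprocess.run(
            [sys.executable, os.path.join("conveyor_demo", "arbitrator.py")],
            cwd=root, capture_output=True, text=True,
        )
        # Module run: sys.path[0] is the current directory (root),
        # which contains the conveyor_demo package.
        as_module = subprocess.run(
            [sys.executable, "-m", "conveyor_demo.arbitrator"],
            cwd=root, capture_output=True, text=True,
        )
        return direct, as_module


direct, as_module = run_demo()
print("direct run exit code:", direct.returncode)
print("-m run output:", as_module.stdout.strip())
```

Installing the package (for example in development mode) has the same effect as the `-m` run: the package's parent location ends up on `sys.path`, and the prefixed imports resolve.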

wimleers commented 13 years ago

I didn't install the package. I simply worked with my existing FileConveyor git code. Can't we make sure that continues to work as well? If I want to work on code, I want to check out code and start working. This is definitely something we'd need to document better.

How do you handle this? You're working on many different branches, meaning that you obviously need to switch from one instance of FileConveyor to another, meaning that you also have to switch to a different egg?

benoitbryon commented 13 years ago

You're right: we should be able to load fileconveyor's modules without the "fileconveyor" prefix, in a "relative" way: the import statement first checks in the current package, then looks in sys.path.

benoitbryon commented 13 years ago

About installation... several recipes exist. The ones I know are:

The first solution appears to be the simplest, but only at first. I came to Python from a PHP background, and that is what I used to do; it seemed natural. But with a little experience (or relevant documentation), you find the alternatives very convenient. I invite you to try virtualenv+pip. It is simpler to understand than buildout, it is a good start, and it is enough for most projects.

Yes, creating a Python package requires some additional effort. Here are some advantages:

So, I recommend using fileconveyor as a Python package. On the other hand, I agree it would be good if it remained simple enough to be used the old "download and run" way. If we can't do both, I suggest giving higher priority to the package.

benoitbryon commented 13 years ago

About the development workflow with a package: several recipes exist too! Here are some guidelines:

In conclusion: developing a project is not more difficult when the project is a Python package. In fact, you get additional tools to develop it.