OCA / connector-interfaces

Odoo Generic Connector for ODBC, .CSV,...
GNU Affero General Public License v3.0
59 stars 152 forks source link

Modules to handle files, decisions on their inclusion #5

Closed guewen closed 2 years ago

guewen commented 9 years ago

Several modules with the same purpose have been proposed for inclusion in the OCA.

The goal of these modules is to provide facilities to handle files, their transfer (ftp, email, ...) and their import / export.

Only one of them should be selected, then remaining features of the others could be proposed in the selected module.

All of them work with "chunks" of documents, they split the documents in chunks that can be imported using the jobs queue of the connector.

Their main features are:

Please let's all be honest and recognize the best of breed in each of these modules.

In my opinion, the most important thing is how is implemented the ground of the module, and in this idea, the connector_flow seems to have the better one with the tasks flow (but I surely miss some points).

I am probably missing a lot of things, so can I ask to the different authors to complete the information (@tremlin, @OSguard, @sebastienbeau, @lepistone, @nbessi)?

OSguard commented 9 years ago

as awnser to https://github.com/OCA/connector-interfaces/pull/3#issuecomment-62737334 we can share example how to make a custom interface for a specific flow, to remove option like you have on the generic wizard, that looks like this: start task

OSguard commented 9 years ago

i was looking in the connector-file module So at the moment it is limit to import account.move as is ref in _file_chunkbinding model It used the connector.backend to hold the connection data to the ftp-data. But this is also not generic if you have different backend types. Also use bindings, what we not support in connector_flow. But this is a future extension for connector_flow to use bindings to have the handling if a object should create or updated by import is made by framework and not in your task. I guess we should start a table to have a better overview:

connector_flow File Exchange Connector File
Chunk Class yes yes yes
Bindings no no yes
Backends no no yes
Basic classes task + task.flow multi task + file.document.condition + file.repository no
File Model impexp.file as metadata to ir.attachment file.document ir.attachment with binding
FTP import yes yes yes
FTP export yes yes no
Email export easy ? no
Email import ? yes no
File to local easy yes no
use for any oobject yes yes no?
how may cron need for ftp download + chung + import 1 1 3
interact with ir.cron no yes no

maybe we can share on ethercalc: https://www.ethercalc.org/xlv2wb11vj (i want to try this tool)

nbessi commented 9 years ago

I do not know the the other module but connector_file provide foundation to manage any file based import. The import of account move is only one implementation.

It uses a provider and policy approach and provide a standard way to manage complex action over sucess or failure.

It is also design with transformation and performance in head and provide a solid feedback to end user and be able to correct data source.

It looks like that the connector_file module can probably be merged with the connector_flow or file_exchage modules and be used as a core engine.

@lepistone who made most of the code should probably give more feedback

nbessi commented 9 years ago

I want also to notice an important point the connector_file module is heavily testes and include a solid unit test coverage. This is an important factor to take in account when accepting project in OCA.

I also would like to have opinion of @gracinet who has dug the problematical and maybe has already evaluate the three solutions.

lepistone commented 9 years ago

Thanks for your work dealing with that!

I follow on @nbessi's comments about connector-file: it was designed to with the three parts (get, parse, load) completely isolated and handled by three separate crons and backends, with the idea that they are somewhat orthogonal.

Another requirement we had was the ability to import a huge file where only one line has a problem (because, say, debit or credit are negative). In that case the user can extract from the interface just the lines of the source file that gave the problem and handle those separately.

It is of course true that the module as is handles specific cases (even though it is designed to be modified easily).

I haven't been working on connector-file for a while because it is in production for the need we had, and I am now more involved in other projects. There is of course no problem for me whatever project is chosen (they look great!) or if some parts of the modules are merged.

FYI on https://github.com/camptocamp/connector-file/pulls there is a substantial and very high-quality PR from @gracinet and a smaller one by @sbidoul that still wait my review. Sorry for that! I hope to review those soon.

I also agree @gracinet is the last one to work substantially on the module, so I am interested on his input here.

gracinet commented 9 years ago

Thanks for inclusion @nbessi. The work I've done in connector_file (additionnal backend for bank statements) was already constrained to that context before I started, so I didn't evaluate the two other solutions (I can take a look).

I concur with you that tests and quality in general are essential to me (not saying anything about the 2 other solutions). What I can say for connector_file is that it's robust and client satisfaction with it is high. Notably, failed tasks (chunks) are easy to track for non technical staff, while still carrying enough information for technical staff to investigate the cause of failure. Also, the ability to have special treatment based on import type is paramount to me (don't know what the other ones provide in that department). Relying purely on the CSV import feature of Odoo is definitely not enough.

My top list for improvement:

sbidoul commented 9 years ago

@gracinet can you check if #3 fits your requirements for generic CSV import? I believe it also purges old ir.attachments when old jobs are purged.

OSguard commented 9 years ago

@nbessi we have no unit test in connector_flow but it is also battle test, we have it up in 4 projects including import of csv file with 500k of lines ;) @lepistone and yes you can edit one line from the interface as well ;)

So we take the test from connector-file and make them work with connector_flow and for the next c2c project they can add feature they need like hash.

tremlin commented 9 years ago

One of the outstanding properties of connector_file is that it uses a lot of features of the connector like backends and synchronizers. I don't know if this level of abstraction is necessary for such "simple" tasks as CSV import/export. IMHO this creates a barrier in a basic framework for developers (like me) that haven't been much into the connector as a whole. One of the design goals of connector_flow was to be as minimal and flexible as possible. Since I was involved in the development, I cannot say if we reached this goal by using only a small part of the connector framework or if the new names and concepts pose a potential barrier, too.

nbessi commented 9 years ago

@tremlin has raised a point here. If we look the repository description:

This repository provides various projects using the Odoo Connector Framework for generic purposes, such as importing from data files or from external databases. 

The repository seems to be dedicated to Connector Framework derived addons.

OSguard commented 9 years ago

@lepistone and @sebastienbeau it would be glade if you look deeper in connector_flow and give a signal if this is a solution that you can use in your next project (maybe adding some missing feature) or your stick on your current solution. From @tremlin and my side. We have look in your solution and then adapted the ideas and make a module for us is more simple and more flexible. So we will stick on connector_flow and maintain it, inside or outside the OCA, because it is a solution fit our needs.

OSguard commented 9 years ago

Now we have a detail description in english of connector_flow http://blog.initos.com/2014/12/09/robust-data-importexport-in-odoo-using-connector_flow-as-a-basis/

Viggor commented 9 years ago

Dear contributors

We have a case where we need to import file from “La Poste”, in that case the file is too large to be handle by the ORM, so we use a SQL request to inject it. Also for banking import some parser was already made in various module in OCA. So we don’t need a system with a parsing method we just need to manage file. We also have a very simple case of export: “Colissimo edi” (Guewen is working on it) so we need a very simple FTP export to do that.

I think we can use file_exchange as file manager to provide files queue and backends. We can made a new python library to wrap all protocol available such as ftp/sftp/filestore and to simplifying the use and the extension to other protocol. Then when the file is ready to be processed we can use multiple method to parse and load, for example in simple case we can use banking import or SQL request and for more complex case we can use connector_file or connector_flow which will depend of exchange_file to take and proceed the files provide by the queue.

For this new architecture we only need a banckend and an object to store document (inherited from ir.attachment) and two cron job (one for sync files and one for trigger the file processing). We will be inspired by connector_file and file_exchange to do this. After this we can refactor connector_file or connector_flow to make them useable with this new version of file_exchange.

I will work on this task to make a PR which can be merged in OCA.

What do you think of that proposal?

sebastienbeau commented 9 years ago

Note : we will not depend of connector.

OSguard commented 9 years ago

My opinion:

guewen commented 9 years ago

On 02/02/2015 01:19 PM, Viggor wrote:

We also have a very simple case of export: “Colissimo edi” (Guewen is working on it) so we need a very simple FTP export to do that.

I was not aware of that, on what should I be supposed to work? :o)

guewen commented 9 years ago

one module "ir.attachment.advance" extends the ir.attachment with more metadata and usefull function for exchange (like hashing) should independent from connector connector_flow and other modules can benefit from that to share this one type instead having one python lib or odoo module for handling file location (ftp, sftp, apis, local) is only a more api to maintain. Ftp on python are 2-3 lines. I see no benefit comparing to overhead.

Agree

sebastienbeau commented 9 years ago

Hi, I really like the idea of "ir.attachment.advanced". How we should name the module "ir_attachement_metadata"?

Regarding the lib, it's not an important point for me. The important point is more having the lighter odoo as possible to "exchange file". This module should have only 1 backend object and 2 crons :

An alternative (for the cron) (maybe it's better) will be :

What do you think?

sebastienbeau commented 9 years ago

Note : The backend do not depend of connector. It only depend of "ir_attachement_metadata"

OSguard commented 9 years ago

for "ir.attachment.advanced" the module name "ir_attachement_metadata" sounds good.

"backend" is a topic from connector framework, so i suggest: 1) rename it to "external.file.location" (or some other) if it is unrelated to connector or 2) use it as connector.backend with connector together. So your backend is a remote location to im/export with the connector, and we can 'bindings' as well, to see "this product was import via csv from this backend (file.location)"

use connector or not is a principle decision. I see connector.jobs more as ir.cron 2.0 becouse it gives you more controll, you see errors and can rerun and so one. The only disadvantage is you need to install the connector module and configured. But if you have it ir.cron are only to queue new jobs but not executing real function.

sebastienbeau commented 9 years ago

Hi Markus. Cool we will start the module ir_attachement_metadata and make a PR of it.

For me it's ok to use "external.file.location" instead of backend as we do not depend of connector.

Regarding the management of error and the binding, I think it will depend of the need.

If you need to have a synchronisation of product with an external system using a file system, you can inherit the "external.file.location" with a "backend.external.file.location" in order to depend of the connector and the binding feature and use the job.

In the case of importing bank statement you just need a simple cron that will take the "ir.attachment.advanced" and push it into the bank-statement parser (we have it in production since 2 years and not big trouble, one file per day easy to follow the error).

For the name of the module, what name do you prefer?

Regarding connector_flow do you think that it will be easy for you to integrate this solution as dependency for supporting FTP, SFTP, filestore, S3...? Does there is some point that we should take care to make the integration easier?

Thanks

OSguard commented 9 years ago

"external_file_location" +1

regarding connector_flow:

sebastienbeau commented 9 years ago

Hi after think twice, here is some conclusion.

The module "external_file_location" will add two model "external.file.location" and "external.file.location.taxk"

On one "external.file.location" we will configure the type of location (ftp, sftp...) and the parameter (login , password, host...). Then on each "external.file.location" we can add a task (import, export, custom_import....) and fill some parameter (path, filename...). Basically the task object will be really similar to the actual task in connector_flow (we will include most of the code/feature of ftp_upload, ftp_download....), I think it will be easy to integrate with connector_flow.

We will have two cron:

This module will not depend of connector.

We will propose a merge soon.

Now the last question is where we put the module "ir_attachement_metadata" and the module "external.file.location" ?

OSguard commented 9 years ago

A) here, pro: it relates to the other addon cons: by it self it is not depending on connector B) https://github.com/OCA/knowledge as it deals with documents (aka ir.attachment) C) new one I think B is better then C. A is also okay. I vote for B

sebastienbeau commented 9 years ago

I vote for B too

OSguard commented 9 years ago

We make a PR for connector_flow for version 8.0 #17

Viggor commented 9 years ago

@OSGuard @tremlin @codingforfun @guewen what do you think of the module i submitted to OCA/knowledge? It's a generic module to communicate with some protocol, it's easy to extend with new protocol through a generic python lib. It make only communication between local and remote location. It can create attachments from remote files or remote files from attachments. https://github.com/OCA/knowledge/pull/45

yvaucher commented 9 years ago

For the follow up @Viggor PR on has been moved to https://github.com/initOS/connector-interfaces/pull/1

bodi000 commented 9 years ago

Hello and sorry fon not being really on subject, but I have a hard time finding a connector for 6.1 or tha can be backported to 6.1 easily. I need to perform two tasks, get files regularly from an external server (sftp) and then import the files into the system.

Is there any living project for 6.1?

Thank you, B

Viggor commented 9 years ago

Actually i don't know if there a similar project for 6.1, but i will ask my coworker for further informations and come back to you.

Viggor commented 9 years ago

@bodi000 Maybe you can see to https://code.launchpad.net/~akretion-team/file-exchange/file-exchange-6.1 it will probably answer to your needs.

paulius-sladkevicius commented 8 years ago

I'm curious If any decision was made after all discussion? I'm searching for a good tool that gives possibility to add import job to add any file + transform it (e.g. https://petl.readthedocs.io/en/latest/) to Odoo friendly format and load.

bealdav commented 8 years ago

here is the first step merged: https://github.com/OCA/server-tools/tree/8.0/attachment_metadata, the second is there https://github.com/OCA/server-tools/pull/377.

Here is implementation of full flows https://github.com/OCA/connector-interfaces/pull/17 with file transformation and connector

And the final step is there https://github.com/initOS/connector-interfaces/pull/1 to link basic exchanges files with more advanced use.

Keep on mind: split features in small modules and add unit tests

petl has a place to transform data.

github-actions[bot] commented 2 years ago

There hasn't been any activity on this issue in the past 6 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days. If you want this issue to never become stale, please ask a PSC member to apply the "no stale" label.