chrthomsen / pygrametl

Official repository for pygrametl - ETL programming in Python
http://pygrametl.org
BSD 2-Clause "Simplified" License
289 stars, 41 forks

The global variable _alltables makes it difficult to run independent tasks in multiple threads #28

Closed: qianxuanyon closed this issue 3 years ago

qianxuanyon commented 3 years ago

Hello, I have been using pygrametl for a while and have found it difficult to run multiple independent synchronization tasks in parallel. Because of the global _alltables, committing one task also triggers a commit of unrelated tasks, and an error in one task interrupts the others.

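from pygrametl.datasources import SQLSource
from pygrametl.tables import BatchFactTable

# One source/target pair per task; source_conn, sql, tag_table, keyrefs,
# tag_cols and dw_conn_wrapper are defined elsewhere.
res_obj = SQLSource(connection=source_conn, query=sql)
tag_obj = BatchFactTable(name=tag_table, keyrefs=keyrefs, measures=tag_cols, targetconnection=dw_conn_wrapper)

def sync(res_obj, tag_obj, dw_conn_wrapper, batchsize=50000, total_num=None, update=True):
    """Copy every row from the source into the target table, then commit."""
    for row in res_obj:
        tag_obj.insert(row)
    # Committing through the ConnectionWrapper also triggers endload() on
    # the tables registered in the global _alltables.
    dw_conn_wrapper.commit()

task = [('source_table', 'tag_table'), ('source_table1', 'tag_table2'), ('source_table3', 'tag_table4')]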

I have multiple tasks like this that need to be synchronized:

table_x --> table_y
table_x1 --> table_y1
table_x2 --> table_y2

The global object _alltables affects the concurrency of multiple independent tasks. Each synchronization task in task simply copies a table from one database to another.

With multithreading, the objects in _alltables are difficult to handle independently.

Could you consider making _alltables an internal object instead of a global one?

chrthomsen commented 3 years ago

Hello,

Thanks for sharing.

I don't really understand what the problem is. Could you please share a more elaborate example? If you want to avoid endload() being called for all BatchFactTables, you could call it yourself on the table you want to load and then commit via your PEP 249 connection instead of the ConnectionWrapper.
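As a rough illustration of that suggestion, here is a stand-alone sketch that uses only the standard library. ToyBatchTable, _all_tables, commit_all and sync are hypothetical stand-ins that model the pattern under discussion; they are not pygrametl's code or API, and the table names and rows are made up. The sketch contrasts a commit that flushes every registered table with a per-task flush followed by a commit on the raw PEP 249 connection.

```python
import sqlite3

# Hypothetical stand-in for a module-level registry such as _alltables.
_all_tables = []


class ToyBatchTable:
    """Hypothetical batching table; it only models the buffering behaviour."""

    def __init__(self, name, connection, batchsize=1000):
        self.name = name
        self.connection = connection
        self.batchsize = batchsize
        self.pending = []
        connection.execute(f"CREATE TABLE IF NOT EXISTS {name} (k INTEGER, v TEXT)")
        _all_tables.append(self)  # every table joins the shared registry

    def insert(self, row):
        # Buffer the row and flush once the batch is full.
        self.pending.append((row["k"], row["v"]))
        if len(self.pending) >= self.batchsize:
            self.endload()

    def endload(self):
        """Flush only this table's buffered rows."""
        if self.pending:
            self.connection.executemany(
                f"INSERT INTO {self.name} VALUES (?, ?)", self.pending
            )
            self.pending = []


def commit_all(connection):
    """Models a wrapper-style commit: flush every registered table first."""
    for table in _all_tables:
        table.endload()
    connection.commit()


def sync(rows, table):
    """Per-task sync: flush this table only, then commit on its own connection."""
    for row in rows:
        table.insert(row)
    table.endload()
    table.connection.commit()


conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")
table_a = ToyBatchTable("tag_a", conn_a)
table_b = ToyBatchTable("tag_b", conn_b)

# Task B is still in progress when task A finishes.
table_b.insert({"k": 1, "v": "half-finished"})
sync([{"k": 1, "v": "x"}, {"k": 2, "v": "y"}], table_a)

rows_a = conn_a.execute("SELECT COUNT(*) FROM tag_a").fetchone()[0]
rows_b = conn_b.execute("SELECT COUNT(*) FROM tag_b").fetchone()[0]
```

In this model, sync leaves table_b's buffered row untouched, whereas calling commit_all(conn_a) at the same point would also have flushed table_b. With real pygrametl objects the equivalent step would be calling endload() on the one table and commit() on the underlying connection, as described above.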

Best regards, Christian Thomsen

qianxuanyon commented 3 years ago

I see. Thanks for the tip.