FreeOpcUa / python-opcua

LGPL Pure Python OPC-UA Client and Server
http://freeopcua.github.io/
GNU Lesser General Public License v3.0

Optimization of writing data to csv file #870

Open avieini opened 5 years ago

avieini commented 5 years ago

Hi, I'm trying to write data from a Kepserver (V6) simulator to a CSV file. I have a predefined list of tags that I want to read every second. I tried splitting the work across separate processes, each with its own file, but with a large number of tags (say 5000 per process) I see a lag between two consecutive requests from the server. This could be a problem on the server side, or my code may not be able to write so many tags fast enough. I used the basic handler from the examples, and I suspect the problem is that I write each tag separately instead of writing them all at once. I would be very glad if someone could have a look and help improve it. Thanks!

the code:

import pandas as pd
from opcua import Client
import time
import datetime
import threading
import platform
import multiprocessing

class SubHandler(object):

##Writing the tag values to file - name,value,timestamp
    def datachange_notification(self, node, val, data):
        global output_file
        output_file.write("{},{},{}\n".format(node.nodeid.Identifier, val,data.monitored_item.Value.SourceTimestamp))

def run_writing(end_file,tags):
    station_name = platform.node()
    client = Client("opc.tcp://127.0.0.1:49320")
    client.connect()

##Creating file 
    filename = datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
    global output_file
    output_file= open(r"D:\Data\{}_{}_{}.csv".format(station_name,end_file,filename), 'a+')
    output_file.write("tag,value,timestamp\n")

##Get the node names from the raw names
    processed_tags = []
    for tag in tags:
        processed_tags.append(client.get_node("ns=2;s={}".format(tag)))

##Defining the subscription 
    handler = SubHandler()
    sub = client.create_subscription(1000, handler)
    handle = sub.subscribe_data_change(processed_tags)

##Scheduling of writing file every minute 
    while True:
        if datetime.datetime.now().second==59:
            time.sleep(1)
            print ("open new file")
            t = open(r"D:\Data\{}_{}_{}.csv".format(station_name,end_file,datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S")), 'a+')
            t.write("tag,value,timestamp\n")
            output_file= t

if __name__ == "__main__":
## Reading the predefined nodes
    df = pd.read_csv(r"C:\Users\User\Desktop\D3.csv")
    df['tag'] = "SIM.D1." + df["Tag Name"]
    tags = df['tag'].values.tolist()

    df = pd.read_csv(r"C:\Users\User\Desktop\D4.csv")
    df['tag'] = "SIM2.D1." + df["Tag Name"]
    tags_2 = df['tag'].values.tolist()

    tags = tags[:10000]+tags_2[:10000]
####

## Creating separate processes for each group of tags
    t1 = multiprocessing.Process(target=run_writing, args=("part_1",tags[:5000],))
    t2 = multiprocessing.Process(target=run_writing, args=("part_2",tags[5000:10000],))
    t3 = multiprocessing.Process(target=run_writing, args=("part_3",tags[10000:15000],))
    t4 = multiprocessing.Process(target=run_writing, args=("part_4",tags[15000:],))
    # starting process 1
    t1.start()
    # starting the remaining processes, slightly staggered
    time.sleep(0.111)
    t2.start()
    time.sleep(0.111)
    t3.start()

    time.sleep(0.111)
    t4.start()
oroulet commented 5 years ago

I don't understand much of your code, but you should definitely store all your data in a variable and write it all at once in another thread, or maybe another process.
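
A minimal sketch of this suggestion, assuming a `queue.Queue` is used as the shared variable and a plain thread does the writing. The names `QueueingHandler`, `drain` and `writer_loop` are hypothetical and not part of python-opcua; only the `datachange_notification(node, val, data)` signature and the fields read from `node` and `data` come from the code posted above.

```python
import csv
import queue
import threading


class QueueingHandler(object):
    """Subscription handler that only enqueues; it never touches the file."""

    def __init__(self, rows):
        self.rows = rows  # a queue.Queue shared with the writer thread

    def datachange_notification(self, node, val, data):
        # Same fields as the original handler, handed off instead of written.
        self.rows.put((node.nodeid.Identifier, val,
                       data.monitored_item.Value.SourceTimestamp))


def drain(rows, first):
    """Return `first` plus every row that is already waiting in the queue."""
    batch = [first]
    while True:
        try:
            batch.append(rows.get_nowait())
        except queue.Empty:
            return batch


def writer_loop(rows, out, stop):
    """Write queued rows to `out` in batches until `stop` is set and the queue is empty."""
    writer = csv.writer(out, lineterminator="\n")
    while not (stop.is_set() and rows.empty()):
        try:
            first = rows.get(timeout=0.2)  # block instead of busy-waiting
        except queue.Empty:
            continue
        writer.writerows(drain(rows, first))
        out.flush()
```

To try it, pass `QueueingHandler(rows)` to `client.create_subscription(1000, handler)` in place of `SubHandler()`, and start `threading.Thread(target=writer_loop, args=(rows, output_file, stop))` with `stop = threading.Event()`. Whether this removes the lag depends on the server and on how many tags change per second.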

zerox1212 commented 5 years ago

Maybe have datachange_notification just save the value to a dict with the node name being the key. Then every 1 second just write the dict to CSV.

Either way I would not write directly to file in datachange_notification on every subscription update.
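
A rough sketch of the dict approach, under a few assumptions: the names `LatestValueHandler`, `take_snapshot` and `write_snapshot` are made up for illustration, and the dict is emptied after each snapshot so that only tags that changed since the last write are emitted. If every tag should appear in every one-second row set, copy the dict instead of swapping it.

```python
import csv
import threading


class LatestValueHandler(object):
    """Keep only the most recent value per tag; writing happens elsewhere."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = {}

    def datachange_notification(self, node, val, data):
        ts = data.monitored_item.Value.SourceTimestamp
        with self._lock:
            self._latest[node.nodeid.Identifier] = (val, ts)

    def take_snapshot(self):
        """Swap in an empty dict and return the old one, holding the lock only briefly."""
        with self._lock:
            snapshot, self._latest = self._latest, {}
        return snapshot


def write_snapshot(snapshot, out):
    """Write one row per tag to an open text file and return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows((tag, val, ts) for tag, (val, ts) in snapshot.items())
    out.flush()
    return len(snapshot)
```

A loop that calls `write_snapshot(handler.take_snapshot(), output_file)` and then `time.sleep(1)` would give the once-per-second write. The file I/O runs outside the lock, so the handler is blocked only for the duration of the dict swap.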

avieini commented 5 years ago

Hi, thanks for your answers! I tried saving to a variable and writing it all at once, but then I get lags in the streaming while the program is writing the variable to a file. Do you have any other suggestions?
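
One more thing that may be worth checking in the posted code: the `while True` loop at the end of `run_writing` only sleeps when the current second is 59, so for the rest of each minute it spins without pausing and competes with the subscription thread for CPU time. Whether this explains the observed lag is not confirmed in this thread. A small sketch of sleeping until the next minute boundary instead, with hypothetical helper names `seconds_until_next_minute` and `rotation_name`:

```python
import datetime


def seconds_until_next_minute(now):
    """Seconds from `now` to the start of the next minute."""
    next_minute = now.replace(second=0, microsecond=0) + datetime.timedelta(minutes=1)
    return (next_minute - now).total_seconds()


def rotation_name(station_name, end_file, now):
    """File name in the same pattern as the original code, without the directory."""
    return "{}_{}_{}.csv".format(station_name, end_file,
                                 now.strftime("%Y-%m-%d-%H-%M-%S"))
```

The rotation loop could then call `time.sleep(seconds_until_next_minute(datetime.datetime.now()))`, open the file named by `rotation_name(...)`, swap it in for `output_file`, and close the previous file, which the original loop never does.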