Closed achimstruve closed 1 year ago
My colleague @Scott-Canning mentioned that we could also use a SQLite3 database, as it is a straightforward approach, and I am quite confident that it is compatible with pandas data frames.
https://docs.python.org/3/library/sqlite3.html
cc.: @BlockBoy32
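As a quick check of the pandas compatibility mentioned above, here is a minimal sketch of a round trip between a data frame and SQLite. The table name `tweets_demo` and the sample rows are made up for illustration, and an in-memory database is used so no file is created.

```python
import sqlite3

import pandas as pd

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical sample rows; the columns are illustrative only.
df_in = pd.DataFrame(
    {
        "id": ["1", "2"],
        "username": ["alice", "bob"],
        "like_count": [3, 7],
    }
)

# Write the data frame to a table; pandas creates the table if it is missing.
df_in.to_sql("tweets_demo", conn, if_exists="append", index=False)

# Read it back into a data frame with an ordinary SQL query.
df_out = pd.read_sql_query(
    "SELECT id, username, like_count FROM tweets_demo ORDER BY id", conn
)
conn.close()

print(df_out)
```

`DataFrame.to_sql` and `pandas.read_sql_query` both accept a plain `sqlite3` connection, so no extra database driver should be needed for this setup.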
Sharing a simple example for creating a sqlite3 database file (independent script file to be run once or for overwriting) and an insertion function:
import sqlite3

# Connect to the database
conn = sqlite3.connect('tweets_auto.db')
cursor = conn.cursor()

# Create a table to store the tweet data
cursor.execute('''CREATE TABLE tweets
                  (id TEXT PRIMARY KEY,
                   author_id TEXT,
                   username TEXT,
                   created_at TEXT,
                   impression_count INTEGER,
                   like_count INTEGER,
                   quote_count INTEGER,
                   reply_count INTEGER,
                   retweet_count INTEGER,
                   text TEXT)''')

# Commit the changes and close the connection
conn.commit()
conn.close()
def load_user_tweets(user_tweets):
    # Connect to the database
    conn = sqlite3.connect('tweets_auto.db')
    cursor = conn.cursor()

    # Pull username
    username = user_tweets['includes']['users'][0]['username']

    # Insert data
    for tweet in user_tweets['data']:
        try:
            cursor.execute(
                "INSERT INTO tweets VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                (
                    tweet['id'],
                    tweet['author_id'],
                    username,
                    tweet['created_at'],
                    tweet['public_metrics']['impression_count'],
                    tweet['public_metrics']['like_count'],
                    tweet['public_metrics']['quote_count'],
                    tweet['public_metrics']['reply_count'],
                    tweet['public_metrics']['retweet_count'],
                    tweet['text']
                )
            )
        except Exception as error:
            # Handle any errors that occur during data insertion
            print(f"Error upon load: {error}")
            conn.rollback()
        else:
            # Commit the changes to the database if no errors occurred
            conn.commit()

    # Close the database connection
    conn.close()
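If committing once per tweet turns out to be a bottleneck, a possible variant is to insert the whole payload in one transaction. This is only a sketch: the name `load_user_tweets_batch`, the sample payload, and the use of `INSERT OR IGNORE` (which silently skips tweets whose `id` already exists) are assumptions and not part of the script above. It takes an open connection so it can be tried against an in-memory database.

```python
import sqlite3

# Same columns as the tweets table above.
SCHEMA = """CREATE TABLE tweets
            (id TEXT PRIMARY KEY, author_id TEXT, username TEXT,
             created_at TEXT, impression_count INTEGER, like_count INTEGER,
             quote_count INTEGER, reply_count INTEGER,
             retweet_count INTEGER, text TEXT)"""


def load_user_tweets_batch(conn, user_tweets):
    """Insert all tweets of one payload in a single transaction."""
    username = user_tweets["includes"]["users"][0]["username"]
    rows = [
        (
            tweet["id"],
            tweet["author_id"],
            username,
            tweet["created_at"],
            tweet["public_metrics"]["impression_count"],
            tweet["public_metrics"]["like_count"],
            tweet["public_metrics"]["quote_count"],
            tweet["public_metrics"]["reply_count"],
            tweet["public_metrics"]["retweet_count"],
            tweet["text"],
        )
        for tweet in user_tweets["data"]
    ]
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


# Hypothetical payload in the shape the function above expects.
sample = {
    "includes": {"users": [{"username": "example_user"}]},
    "data": [
        {
            "id": "1",
            "author_id": "42",
            "created_at": "2023-01-01T00:00:00.000Z",
            "public_metrics": {
                "impression_count": 10,
                "like_count": 2,
                "quote_count": 0,
                "reply_count": 1,
                "retweet_count": 0,
            },
            "text": "hello",
        }
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
load_user_tweets_batch(conn, sample)
load_user_tweets_batch(conn, sample)  # second load is ignored (same id)
row_count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
stored_username = conn.execute("SELECT username FROM tweets").fetchone()[0]
conn.close()
```

Whether skipping duplicates is the right behaviour depends on the use case; `INSERT OR REPLACE` would instead refresh the stored metrics for an existing tweet.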
@Scott-Canning Thanks for the advice. @achimstruve I implemented a POC with all of the data flowing properly. It still seems very slow on the interface, but I cannot pin down why. I can dig into it more when I have the cycles, cheers.
Also @achimstruve, I am going to leave this open for now: while we solved the core issue, I want a reminder to do performance improvements specifically for the interface interactions.
@achimstruve Hey, could you take a swing at this? For some reason mine is still really slow and I can't seem to identify why.
I have now run some performance tests in our Streamlit interface, with the following results:
- Run empty script: <1s
- Simulation only: 33s
- Simulation + Postprocessing: 43s
This clearly shows that the calculation work is somehow being done inside Streamlit, which makes it quite slow compared to pure Python execution.
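For anyone who wants to repeat the comparison outside the interface, stage timings can be collected with `time.perf_counter`. This is a minimal sketch: `run_simulation` and `postprocess` are hypothetical stand-ins and not the functions from this repository.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations in seconds, keyed by stage name.
timings = {}


@contextmanager
def timed(label):
    """Record how long the enclosed block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def run_simulation():
    # Hypothetical stand-in for the real simulation.
    return sum(i * i for i in range(100_000))


def postprocess(result):
    # Hypothetical stand-in for the real post-processing step.
    return {"total": result}


with timed("simulation"):
    raw = run_simulation()
with timed("postprocessing"):
    summary = postprocess(raw)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

Running the same stages once as a plain Python script and once from within the interface would show how much of the time is added by the interface itself.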
This Streamlit blog post offers several tips for improving Streamlit performance.
I will close this issue as it is solved and open a new one dedicated to the UI performance.
The new one is #47.
Does anyone know how we can retrieve the post-processed data from a script that is run via subprocess?
In line 45 of `./Model/interface.py` we call the subprocess to run the simulation script. However, how can we, for example, access the data returned by the `postprocessing` function in row 68 of `./Model/simulation.py` inside the `interface.py` script, so that it is available for future plots on the website?
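One possible approach is to have the child script print its result as JSON on stdout and let the parent parse the captured output. This is a sketch under assumptions: the inline `CHILD_CODE` is a hypothetical stand-in for the simulation script, and its `postprocessing` function only mimics the real one by returning a small JSON-serialisable dictionary.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the simulation script: it prints its
# post-processed result as a single JSON document on stdout.
CHILD_CODE = """
import json

def postprocessing():
    return {"steps": 3, "values": [1.0, 2.5, 4.0]}

print(json.dumps(postprocessing()))
"""


def run_and_collect():
    """Run the child script and return its JSON output as a Python object."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,  # capture stdout and stderr instead of printing
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)


result = run_and_collect()
print(result["values"])
```

This only works if the script prints nothing else to stdout and the result is JSON-serialisable. For larger results, the script could write to a file or to the SQLite database discussed above and the interface could read from there, or the interface could import and call the function directly instead of using a subprocess.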