Open tomkeee opened 2 years ago
@tomkeee Printing is slow in every language. If you print every line of a big file, keep in mind that it will drastically slow down the operation. What is the execution time (processing the full file) without printing?
@Tatarinho You are right that print operations are quite slow, yet I ran a similar operation on my local machine (code below) and the execution time was 55 seconds (on Scramjet it was 25 minutes).
import time

lines_number = 0
with open("/home/sirocco/Pulpit/data.csv") as file_in:
    start = time.time()
    for i in file_in:
        if lines_number < 1000:
            print(f"{lines_number} \n")
        elif lines_number > 2000:
            print(f"{lines_number} bigger than 2000 \n")
        lines_number += 1
print(f"the line_numbers is {lines_number}\n execution time: {time.time()-start}")
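To separate the cost of printing from the cost of reading the file, the same loop can be timed with output switched off. This is a minimal sketch for a local run only. The function name `time_scan` and its `emit` flag are made up for illustration and are not part of Scramjet.

```python
import time


def time_scan(path, emit=True):
    """Read `path` line by line and return (line_count, elapsed_seconds).

    `emit` toggles the print call, so two runs differ only in their output.
    """
    lines_number = 0
    start = time.perf_counter()
    with open(path) as file_in:
        for _ in file_in:
            if emit:
                print(lines_number)
            lines_number += 1
    return lines_number, time.perf_counter() - start
```

Calling `time_scan(path, emit=False)` and then `time_scan(path, emit=True)` on the same file gives two timings whose difference is roughly the printing overhead.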
Hi @tomkeee, we'll be looking into this next week.
Hmm... I did some initial digging and ran a test over the local network. A similar program in Node works quite fast, but not as fast as reading from disk...
We need to take into account the network connection, but that wouldn't explain 25 minutes.
Could you follow this guide: https://docs.scramjet.org/platform/self-hosted-installation
Then, based on this, can you try your program with the data sent to 127.0.0.1? That would let us exclude the network and the platform configuration as the culprit...
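For a rough baseline of what loopback alone costs, lines can be streamed over a plain TCP socket on 127.0.0.1 and counted on the other end. This sketch uses only the Python standard library and does not touch the Scramjet platform or its API. The function name `loopback_line_count` is hypothetical.

```python
import socket
import threading
import time


def loopback_line_count(lines, host="127.0.0.1"):
    """Send newline-terminated byte strings over a loopback TCP connection.

    Returns (lines_received, elapsed_seconds).
    """
    server = socket.create_server((host, 0))  # port 0: the OS picks a free port
    port = server.getsockname()[1]
    result = {}

    def consume():
        # Accept one connection and count lines until the sender closes it.
        conn, _ = server.accept()
        count = 0
        with conn, conn.makefile("rb") as stream:
            for _ in stream:
                count += 1
        result["count"] = count

    reader = threading.Thread(target=consume)
    reader.start()
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        for line in lines:
            client.sendall(line)
    reader.join()  # wait until the receiver has seen end-of-stream
    elapsed = time.perf_counter() - start
    server.close()
    return result["count"], elapsed
```

Passing the file opened in binary mode, for example `loopback_line_count(open(path, "rb"))`, gives a figure to compare against the disk-only run. If this is also fast, the loopback transport itself is unlikely to explain a 25-minute run.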
I tested how fast the platform analyzes a big .csv file (532 MB, 3 235 282 lines). The execution time of the program (code below) is about 25 minutes.
The program should just print the current line with a very simple comment.
main.py
package.json
requirements.txt
scramjet-framework-py