LLNL / PyDV

PyDV: Python Data Visualizer

Reading From .ultra Too Slow #267

Closed: fillmore1 closed 2 years ago

fillmore1 commented 2 years ago

The function pydvpy.read() is very slow: it takes over a second to parse a 200,000-line file.

I believe there is room for improvement. Profiling shows that a significant amount of time is used in len, str.strip, str.split, str.lower, and list.append.
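
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using the standard-library cProfile and pstats modules. It does not call PyDV: parse_lines is a hypothetical stand-in for a line-by-line reader, and the sample data is made up for illustration.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Hypothetical stand-in for a line-by-line reader (not PyDV's code)."""
    points = []
    for line in lines:
        text = line.strip().lower()
        if len(text) == 0 or text.startswith("#"):
            continue  # skip blank lines and label lines
        fields = text.split()
        points.append((float(fields[0]), float(fields[1])))
    return points


def profile_parse(lines, top=10):
    """Run parse_lines under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = parse_lines(lines)
    profiler.disable()

    # Collect the report in memory instead of printing to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()


# Made-up sample: one label line followed by 1000 "x y" pairs.
sample = ["# curve a\n"] + [f"{i} {i * 2.0}\n" for i in range(1000)]
points, report = profile_parse(sample)
print(report)
```

Sorting by tottime lists the functions where the most time is spent directly, which is where per-line calls such as str.strip and str.split show up.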

There is also a potential resource leak: if an exception is raised during parsing, the file f will not be closed. This can be fixed by using a context manager (with open(fname, 'r') as f:).
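
A rough sketch of what a context-manager-based reader could look like. This is not the PyDV implementation: read_curves is a hypothetical name, and the file layout (a '# label' line followed by whitespace-separated 'x y' pairs) is a simplifying assumption. It also calls split() once per line rather than strip() followed by split(), since split() with no arguments already ignores surrounding whitespace.

```python
import os
import tempfile


def read_curves(fname):
    """Hypothetical reader for a simplified ultra-like file (not PyDV's API).

    Assumes each curve starts with a '# label' line followed by
    whitespace-separated 'x y' pairs, one pair per line.
    """
    curves = {}
    current = None
    # The with-block closes the file even if float() raises mid-parse.
    with open(fname, "r") as f:
        for line in f:
            fields = line.split()  # no separate strip() needed
            if not fields:
                continue  # blank line
            if fields[0].startswith("#"):
                label = line.strip().lstrip("#").strip()
                current = curves.setdefault(label, ([], []))
                continue
            if current is None:
                continue  # data before any label is skipped in this sketch
            current[0].append(float(fields[0]))
            current[1].append(float(fields[1]))
    return curves


# Usage with a small made-up file.
with tempfile.NamedTemporaryFile("w", suffix=".ult", delete=False) as tmp:
    tmp.write("# ramp\n0 0\n1 2\n\n# flat\n0 5\n1 5\n")
    path = tmp.name
try:
    curves = read_curves(path)
finally:
    os.remove(path)
```

Each label maps to a pair of lists (x values, y values), so curves["ramp"] holds the two points from the first block of the sample file.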

fillmore1 commented 2 years ago

I messed up and based my branch on the updated code from #258. We can wait for that branch to merge or someone else can apply my changes. Sorry about that.

rusu24edward commented 2 years ago

Hi @fillmore1, I kept your module changes in a different branch since they are in line with a larger effort in PyDV to refactor the modules. If you want to include only the changes you made to improve the reader, you can create a new branch off master and cherry-pick your commits, like so:

git checkout -b new-branch-off-master
git cherry-pick 48519e5dd5ffc04f0018feb9250647dd3e55cda5
git cherry-pick 6747cfcb22fbf9c332a4cb03a9fe8d4edf9a4014
git cherry-pick cc9b898faeafeb22e7a0b522c2b3e39a4939ec31
git cherry-pick dd30cb558a5c618fb664afc83078022b94b027de
git cherry-pick 0862054bc62bd7a52d5118a7385a2d43e0ba2ec8
git cherry-pick 952a45095a43673779a48a95a08c94ffcc5372ed

Then submit a PR for that, and it should have only the changes related to the reader.