borondics opened this issue 3 years ago
I discovered that np.loadtxt
has an advantage at small file sizes, while pd.read_csv
is faster for large ones. In other words, NumPy's loading time is linear in the file size, while Pandas' is not.
The crossover is around 1 MB, which raises an interesting question, since individual files are usually below that limit. If we want to speed up loading of large files we should definitely switch, but it could set us back when loading a series of small files with Multifile. I still need to test what this would mean for us.
The figure below was produced in pure Python, not through the Quasar loaders. Going through the loaders adds some overhead, which would also be interesting to investigate and reduce.
We use numpy.loadtxt in a lot of places, and there are faster alternatives; Pandas, for example, can be significantly faster. @markotoplak, @stuart-cls, do you think we should switch to Pandas for loading the data?
The file in this case was ~190 MB, which is a typical FPA image.
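For reference, a minimal sketch of the kind of comparison described above: it generates an in-memory CSV, loads it with both `np.loadtxt` and `pd.read_csv`, checks that the two give the same array, and times each. The file sizes, column count, and helper names here are my own placeholders, not the actual benchmark used for the figure.

```python
import io
import time

import numpy as np
import pandas as pd

def make_csv(n_rows, n_cols=2):
    """Generate an in-memory CSV of random floats (stand-in for a spectral file)."""
    data = np.random.rand(n_rows, n_cols)
    buf = io.StringIO()
    np.savetxt(buf, data, delimiter=",")
    return buf

def load_numpy(buf):
    buf.seek(0)
    return np.loadtxt(buf, delimiter=",")

def load_pandas(buf):
    buf.seek(0)
    # header=None because the file is pure numbers; .to_numpy() yields
    # the same float64 ndarray that np.loadtxt returns.
    return pd.read_csv(buf, header=None, dtype=float).to_numpy()

buf = make_csv(1000)
a = load_numpy(buf)
b = load_pandas(buf)
assert a.shape == b.shape == (1000, 2)
assert np.allclose(a, b)

# Crude timing; for the real crossover measurement one would sweep
# file sizes (e.g. 10 kB .. 200 MB) and repeat each point several times.
for name, loader in [("numpy", load_numpy), ("pandas", load_pandas)]:
    t0 = time.perf_counter()
    loader(buf)
    print(name, time.perf_counter() - t0)
```

Since both loaders end up producing the same ndarray, switching the backend should be transparent to callers as long as the parsing options (delimiter, comments, skipped header rows) are mapped over carefully.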