In some situations, researchers would like to build an STM object from an existing CSV file.
One example can be found in this notebook, where Cell 11 and Cell 14 give the skeleton of from_csv. A Dask DataFrame is used to handle the large CSV file lazily. However, because a CSV file is plain text, the chunk sizes are not known in advance and need to be computed before lazy functions can be applied (as implemented in Cell 11). Cell 14 contains an implementation that walks through all columns and separates them according to their names.
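To make the column-separation step concrete, here is a minimal sketch using only the Python standard library. It is not the notebook's code: the function name `split_columns`, the sample column names, and the naming convention `<field>_<YYYYMMDD>` for per-epoch columns are all assumptions made for illustration. The real CSV may follow a different convention.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed (hypothetical) convention: "<field>_<YYYYMMDD>" marks a per-epoch column.
EPOCH_COLUMN = re.compile(r"^(?P<field>.+)_(?P<epoch>\d{8})$")


def split_columns(header):
    """Separate column names into space-only and space-time groups."""
    space_only = []
    space_time = defaultdict(list)  # field name -> list of (epoch, column name)
    for name in header:
        match = EPOCH_COLUMN.match(name)
        if match:
            space_time[match["field"]].append((match["epoch"], name))
        else:
            space_only.append(name)
    # Sort each field's columns chronologically (YYYYMMDD sorts as text).
    for field in space_time:
        space_time[field].sort()
    return space_only, dict(space_time)


# Tiny in-memory CSV standing in for the real file (made-up column names).
sample = io.StringIO(
    "pnt_id,lat,lon,d_20200113,d_20200101,a_20200101,a_20200113\n"
    "0,52.0,4.3,0.2,0.1,1.0,1.1\n"
)
header = next(csv.reader(sample))
space_only, space_time = split_columns(header)
```

In this sketch, columns without an epoch suffix would become attributes along the space dimension only, while each group in `space_time` would be stacked into a (space, time) array. Only the header row is read, so the grouping can be decided before any data is loaded.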