starfile.read fails when parsing a loop block that contains no data rows and was previously written with starfile.write.
In that case the block header is followed by multiple empty lines, which trip up the parser and end in a pandas EmptyDataError.
Minimal example to reproduce it
>>> import pandas as pd
>>> import starfile
>>> starfile.write({'block':pd.DataFrame({'col1':[]})},'test.star')
>>> sf=starfile.read('test.star')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/andreas/sft/starfile/src/starfile/functions.py", line 43, in read
parser = StarParser(filename, n_blocks_to_read=read_n_blocks, parse_as_string=parse_as_string)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/andreas/sft/starfile/src/starfile/parser.py", line 48, in __init__
self.parse_file()
File "/home/andreas/sft/starfile/src/starfile/parser.py", line 60, in parse_file
block_name, block = self._parse_data_block()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/andreas/sft/starfile/src/starfile/parser.py", line 74, in _parse_data_block
return block_name, self._parse_loop_block()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/andreas/sft/starfile/src/starfile/parser.py", line 124, in _parse_loop_block
df = pd.read_csv(
^^^^^^^^^^^^
File "/usr/lib64/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
return _read(filepath_or_buffer, kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
self._engine = self._make_engine(f, self.engine)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
return mapping[engine](f, **self.options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
self._reader = parsers.TextReader(src, **kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
Possible fix
A small change in the parser can fix the issue.
Instead of checking whether loop_data is equal to a single line feed, I check whether loop_data starts with a line feed.
In addition, I added a line that sets the column headers even for empty loop blocks.
diff --git a/src/starfile/parser.py b/src/starfile/parser.py
index 0febed0..6b905c1 100644
--- a/src/starfile/parser.py
+++ b/src/starfile/parser.py
@@ -116,9 +116,10 @@ class StarParser:
loop_data += '\n'
# put string data into a dataframe
- if loop_data == '\n':
+ if loop_data.startswith('\n'):
n_cols = len(loop_column_names)
df = pd.DataFrame(np.zeros(shape=(0, n_cols)))
+ df.columns = loop_column_names
else:
column_name_to_index = {col: idx for idx, col in enumerate(loop_column_names)}
df = pd.read_csv(
With these changes the parser reads the star file fine.
>>> import pandas as pd
>>> import starfile
>>> starfile.write({'block':pd.DataFrame({'col1':[]})},'test.star')
>>> sf=starfile.read('test.star')
>>> sf
Empty DataFrame
Columns: [col1]
Index: []
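For readers who want to try the patched branch in isolation, here is a minimal sketch of the same logic outside of StarParser. The helper name loop_data_to_dataframe is hypothetical, and the read_csv call is a simplified stand-in (whitespace-delimited, no header row) rather than the exact call used in parser.py.

```python
from io import StringIO

import numpy as np
import pandas as pd


def loop_data_to_dataframe(loop_data, loop_column_names):
    """Hypothetical helper mirroring the patched branch of _parse_loop_block."""
    if loop_data.startswith('\n'):
        # Empty loop block: only blank lines follow the column names.
        # Build a zero-row frame and still attach the column headers.
        n_cols = len(loop_column_names)
        df = pd.DataFrame(np.zeros(shape=(0, n_cols)))
        df.columns = loop_column_names
    else:
        # Simplified assumption: whitespace-delimited rows without a header.
        df = pd.read_csv(
            StringIO(loop_data),
            sep=r'\s+',
            header=None,
            names=loop_column_names,
        )
    return df


# Several blank lines, as described above for an empty block written by starfile.write
empty = loop_data_to_dataframe('\n\n\n', ['col1'])

# A block that does contain data rows still goes through read_csv
filled = loop_data_to_dataframe('1 2\n3 4\n', ['a', 'b'])
```

The old check `loop_data == '\n'` is False for `'\n\n\n'`, so that input would fall through to read_csv; `startswith('\n')` catches both the single and the multiple blank line case.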