Closed fsoubelet closed 1 year ago
@rdemaria how would you feel about using a dedicated function to access the headers (only) of the file? This would be used as:
from tfs.reader import read_headers
headers = read_headers("your_file.tfs")
OK! A bit more cumbersome than adding an argument, but I noticed that you forbid having many arguments, so I won't argue...
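To illustrate what a header-only read could look like, here is a minimal, stdlib-only sketch. It is not the `tfs.reader` implementation: the function name `sketch_read_headers` is hypothetical, and it assumes the usual TFS layout in which header lines start with `@` and carry a name, a type token (such as `%s`, `%d` or `%le`) and a value. The point is only that parsing can stop at the first non-header line, so the data block is never loaded.

```python
import io
import shlex


def sketch_read_headers(lines):
    """Collect '@ NAME TYPE VALUE' header lines and stop before the data part.

    Hypothetical helper for illustration, not part of tfs-pandas.
    """
    headers = {}
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if not stripped.startswith("@"):
            break  # column names, types or data reached: stop reading
        # shlex keeps quoted values such as "Demo file" as a single token
        _, name, type_token, value = shlex.split(stripped)
        if type_token.endswith("s"):
            headers[name] = value
        elif type_token.endswith("d"):
            headers[name] = int(value)
        else:
            headers[name] = float(value)
    return headers


# Small in-memory stand-in for a file on disk (assumed content)
sample = io.StringIO(
    '@ TITLE %09s "Demo file"\n'
    "@ NPART %d 42\n"
    "@ ENERGY %le 6500.0\n"
    "* NAME S\n"
    "$ %s %le\n"
    ' "BPM1" 0.5\n'
)
headers = sketch_read_headers(sample)
```

Because the loop breaks at the first line that is not a header, the cost of the call does not depend on the size of the data block.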
This PR addresses #114 and #115.
For the first one, as `pandas` naturally infers empty strings (`""`) to `NaN` when reading, a step is inserted to convert back `NaN` values in `string` or `object`-type columns into empty strings.

For the latter, a new function `read_headers` is added to `tfs.reader`, which does exactly that.

Incidentally, a little internal rework of the reader was done. A new function and dataclass were added that take care of reading metadata of the file: everything but the dataframe part (headers, number of non-data lines, column names and types). It is used in the `read_tfs` and `read_headers` functions (this was mostly a block export from `read_tfs` to the helper `_read_metadata`). Thanks to @JoschD for suggesting this.

Tests were added. Version is bumped to `3.5.0`.
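As a rough illustration of the `NaN`-to-empty-string step described above, the sketch below uses plain `pandas` on a small CSV snippet rather than a TFS file. The helper name `restore_empty_strings` and the sample data are assumptions made for the example, and the actual code in this PR may differ.

```python
import io

import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def restore_empty_strings(df):
    """Replace NaN with "" in text-like columns only (hypothetical helper)."""
    df = df.copy()
    for col in df.columns:
        if is_object_dtype(df[col]) or is_string_dtype(df[col]):
            df[col] = df[col].fillna("")
    return df


# The quoted empty string in the first row is read back as NaN by default
raw = 'NAME,COMMENT,VALUE\nA,"",1.0\nB,hello,\n'
df = pd.read_csv(io.StringIO(raw))
fixed = restore_empty_strings(df)
```

Numeric columns are left alone, so a genuinely missing number stays `NaN` while text columns get their empty strings back.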