melonora / napari-cell-gater

A napari plugin for cell marker gating.
BSD 3-Clause "New" or "Revised" License

Load files in plugin #14

Closed sfritzs closed 5 months ago

sfritzs commented 5 months ago

Hi @josenimo and @melonora ,

I am trying to load some files in the plugin and face an issue. I pasted the error below. Thanks for your help.

All the best, Sonja

INFO: Loaded 36 regionprops csvs.

UnicodeDecodeError                        Traceback (most recent call last)
File ~\Documents\Image_Processing\napari-cell-gater\src\cell_gater\widgets\sample_widget.py:135, in SampleWidget._open_sample_dialog(self=<cell_gater.widgets.sample_widget.SampleWidget object>)
    132 folder = self._dir_dialog()
    134 if folder not in {"", None}:
--> 135     self._assign_regionprops_to_model(folder)
        folder = 'Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv'
        self = <cell_gater.widgets.sample_widget.SampleWidget object at 0x00000225B7906D40>
    136     update_open_history(folder)

File ~\Documents\Image_Processing\napari-cell-gater\src\cell_gater\widgets\sample_widget.py:171, in SampleWidget._assign_regionprops_to_model(self=<cell_gater.widgets.sample_widget.SampleWidget object>, folder='Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv')
    169 def _assign_regionprops_to_model(self, folder: str) -> None:
    170     """Read the csv files in the directory and assign the resulting concatenated DataFrame in the DataModel."""
--> 171     self.model.regionprops_df = stack_csv_files(Path(folder))
        self = <cell_gater.widgets.sample_widget.SampleWidget object at 0x00000225B7906D40>
        folder = 'Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv'

File ~\Documents\Image_Processing\napari-cell-gater\src\cell_gater\utils\csv_df.py:30, in stack_csv_files(csv_dir=WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv'))
     28 df = pd.DataFrame()
     29 for file in csv_files:
---> 30     df_file = pd.read_csv(file)
        file = WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv')
        pd = <module 'pandas' from 'C:\Users\sfritzs\AppData\Roaming\Python\Python310\site-packages\pandas\__init__.py'>
     31     df_file["sample_id"] = file.stem
     32     df = pd.concat([df, df_file], ignore_index=True)

File ~\AppData\Roaming\Python\Python310\site-packages\pandas\io\parsers\readers.py:948, in read_csv(filepath_or_buffer=WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv'), sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, usecols=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=<no_default>, keep_date_col=False, date_parser=<no_default>, date_format=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, on_bad_lines='error', delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None, dtype_backend=<no_default>)
    935 kwds_defaults = _refine_defaults_read(
    936     dialect,
    937     delimiter,
    (...)
    944     dtype_backend=dtype_backend,
    945 )
    946 kwds.update(kwds_defaults)
--> 948 return _read(filepath_or_buffer, kwds)
        kwds = {'delimiter': ',', 'header': 'infer', 'names': None, 'index_col': None, 'usecols': None, 'dtype': None, 'engine': 'c', 'converters': None, 'true_values': None, 'false_values': None, 'skipinitialspace': False, 'skiprows': None, 'skipfooter': 0, 'nrows': None, 'na_values': None, 'keep_default_na': True, 'na_filter': True, 'verbose': False, 'skip_blank_lines': True, 'parse_dates': False, 'infer_datetime_format': <no_default>, 'keep_date_col': False, 'date_parser': <no_default>, 'date_format': None, 'dayfirst': False, 'cache_dates': True, 'iterator': False, 'chunksize': None, 'compression': 'infer', 'thousands': None, 'decimal': '.', 'lineterminator': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'escapechar': None, 'comment': None, 'encoding': None, 'encoding_errors': 'strict', 'dialect': None, 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'delim_whitespace': False, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'dtype_backend': <no_default>, 'engine_specified': False}
        filepath_or_buffer = WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv')

File ~\AppData\Roaming\Python\Python310\site-packages\pandas\io\parsers\readers.py:611, in _read(filepath_or_buffer=WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv'), kwds={'cache_dates': True, 'chunksize': None, 'comment': None, 'compression': 'infer', 'converters': None, 'date_format': None, 'date_parser': <no_default>, 'dayfirst': False, 'decimal': '.', 'delim_whitespace': False, ...})
    608 _validate_names(kwds.get("names", None))
    610 # Create the parser.
--> 611 parser = TextFileReader(filepath_or_buffer, **kwds)
        TextFileReader = <class 'pandas.io.parsers.readers.TextFileReader'>
        filepath_or_buffer = WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv')
        kwds = {'delimiter': ',', 'header': 'infer', 'names': None, 'index_col': None, 'usecols': None, 'dtype': None, 'engine': 'c', 'converters': None, 'true_values': None, 'false_values': None, 'skipinitialspace': False, 'skiprows': None, 'skipfooter': 0, 'nrows': None, 'na_values': None, 'keep_default_na': True, 'na_filter': True, 'verbose': False, 'skip_blank_lines': True, 'parse_dates': False, 'infer_datetime_format': <no_default>, 'keep_date_col': False, 'date_parser': <no_default>, 'date_format': None, 'dayfirst': False, 'cache_dates': True, 'iterator': False, 'chunksize': None, 'compression': 'infer', 'thousands': None, 'decimal': '.', 'lineterminator': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'escapechar': None, 'comment': None, 'encoding': None, 'encoding_errors': 'strict', 'dialect': None, 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'delim_whitespace': False, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'dtype_backend': <no_default>, 'engine_specified': False}
    613 if chunksize or iterator:
    614     return parser

File ~\AppData\Roaming\Python\Python310\site-packages\pandas\io\parsers\readers.py:1448, in TextFileReader.__init__(self=<pandas.io.parsers.readers.TextFileReader object>, f=WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv'), engine='c', **kwds={'cache_dates': True, 'chunksize': None, 'comment': None, 'compression': 'infer', 'converters': None, 'date_format': None, 'date_parser': <no_default>, 'dayfirst': False, 'decimal': '.', 'delim_whitespace': False, ...})
   1445     self.options["has_index_names"] = kwds["has_index_names"]
   1447 self.handles: IOHandles | None = None
-> 1448 self._engine = self._make_engine(f, self.engine)
        self.engine = 'c'
        self = <pandas.io.parsers.readers.TextFileReader object at 0x00000225B9BA4F10>
        f = WindowsPath('Y:/Sonjas_Head_and_Neck/P21E05_HN26/cylinter/csv/._1.csv')

File ~\AppData\Roaming\Python\Python310\site-packages\pandas\io\parsers\readers.py:1723, in TextFileReader._make_engine(self=<pandas.io.parsers.readers.TextFileReader object>, f=<_io.TextIOWrapper name='Y:\\Sonjas_Head_and_Nec...ylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>, engine='c')
   1720     raise ValueError(msg)
   1722 try:
-> 1723     return mapping[engine](f, **self.options)
        mapping = {'c': <class 'pandas.io.parsers.c_parser_wrapper.CParserWrapper'>, 'python': <class 'pandas.io.parsers.python_parser.PythonParser'>, 'pyarrow': <class 'pandas.io.parsers.arrow_parser_wrapper.ArrowParserWrapper'>, 'python-fwf': <class 'pandas.io.parsers.python_parser.FixedWidthFieldParser'>}
        engine = 'c'
        f = <_io.TextIOWrapper name='Y:\\Sonjas_Head_and_Neck\\P21E05_HN26\\cylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>
        self = <pandas.io.parsers.readers.TextFileReader object at 0x00000225B9BA4F10>
        mapping[engine] = <class 'pandas.io.parsers.c_parser_wrapper.CParserWrapper'>
        self.options = {'delimiter': ',', 'escapechar': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'skipinitialspace': False, 'lineterminator': None, 'header': 0, 'index_col': None, 'names': None, 'skiprows': set(), 'na_values': {'', 'N/A', 'NA', 'None', 'n/a', '-nan', 'NULL', '-NaN', '-1.#IND', '<NA>', 'NaN', '-1.#QNAN', '1.#IND', '1.#QNAN', 'nan', '#NA', '#N/A N/A', '#N/A', 'null'}, 'keep_default_na': True, 'true_values': None, 'false_values': None, 'converters': {}, 'dtype': None, 'cache_dates': True, 'thousands': None, 'comment': None, 'decimal': '.', 'parse_dates': False, 'keep_date_col': False, 'dayfirst': False, 'date_parser': <no_default>, 'date_format': None, 'usecols': None, 'verbose': False, 'encoding': None, 'compression': 'infer', 'skip_blank_lines': True, 'encoding_errors': 'strict', 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'dtype_backend': <no_default>, 'delim_whitespace': False, 'na_filter': True, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'na_fvalues': set()}
   1724 except Exception:
   1725     if self.handles is not None:

File ~\AppData\Roaming\Python\Python310\site-packages\pandas\io\parsers\c_parser_wrapper.py:93, in CParserWrapper.__init__(self=<pandas.io.parsers.c_parser_wrapper.CParserWrapper object>, src=<_io.TextIOWrapper name='Y:\\Sonjas_Head_and_Nec...ylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>, **kwds={'allow_leading_cols': True, 'comment': None, 'converters': {}, 'decimal': '.', 'delim_whitespace': False, 'delimiter': ',', 'doublequote': True, 'dtype': None, 'dtype_backend': 'numpy', 'encoding_errors': 'strict', ...})
     90 if kwds["dtype_backend"] == "pyarrow":
     91     # Fail here loudly instead of in cython after reading
     92     import_optional_dependency("pyarrow")
---> 93 self._reader = parsers.TextReader(src, **kwds)
        kwds = {'delimiter': ',', 'escapechar': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'skipinitialspace': False, 'lineterminator': None, 'header': 0, 'index_col': None, 'names': None, 'skiprows': set(), 'na_values': {'', 'N/A', 'NA', 'None', 'n/a', '-nan', 'NULL', '-NaN', '-1.#IND', '<NA>', 'NaN', '-1.#QNAN', '1.#IND', '1.#QNAN', 'nan', '#NA', '#N/A N/A', '#N/A', 'null'}, 'keep_default_na': True, 'true_values': None, 'false_values': None, 'converters': {}, 'dtype': None, 'thousands': None, 'comment': None, 'decimal': '.', 'usecols': None, 'verbose': False, 'skip_blank_lines': True, 'encoding_errors': 'strict', 'on_bad_lines': 0, 'dtype_backend': 'numpy', 'delim_whitespace': False, 'na_filter': True, 'float_precision': None, 'na_fvalues': set(), 'allow_leading_cols': True}
        self = <pandas.io.parsers.c_parser_wrapper.CParserWrapper object at 0x00000225B9BA4EE0>
        src = <_io.TextIOWrapper name='Y:\\Sonjas_Head_and_Neck\\P21E05_HN26\\cylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>
        parsers = <module 'pandas._libs.parsers' from 'C:\Users\sfritzs\AppData\Roaming\Python\Python310\site-packages\pandas\_libs\parsers.cp310-win_amd64.pyd'>
     95 self.unnamed_cols = self._reader.unnamed_cols
     97 # error: Cannot determine type of 'names'

File parsers.pyx:579, in pandas._libs.parsers.TextReader.__cinit__()

File parsers.pyx:668, in pandas._libs.parsers.TextReader._get_header()

File parsers.pyx:879, in pandas._libs.parsers.TextReader._tokenize_rows()

File parsers.pyx:890, in pandas._libs.parsers.TextReader._check_tokenize_status()

File parsers.pyx:2050, in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
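
For readers puzzling over the final message: in UTF-8, bytes in the range 0x80 to 0xBF are continuation bytes and cannot begin a character, so a strict decoder stops at the first one it meets in a start position. The sketch below reproduces that behaviour on made-up bytes; the helper name `first_bad_byte` is hypothetical and is not part of the plugin or of pandas.

```python
from typing import Optional, Tuple


def first_bad_byte(raw: bytes) -> Optional[Tuple[int, int, str]]:
    """Return (position, byte value, reason) for the first byte that is
    not valid UTF-8, or None if the whole input decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the offending byte in the input.
        return err.start, raw[err.start], err.reason
    return None


# Hypothetical sample data: three ASCII bytes followed by a lone 0xb0.
print(first_bad_byte(b"abc\xb0"))
```

Running the same check on the first few dozen bytes of a file that fails to load is one way to tell a real CSV with an unusual encoding apart from a file that is not text at all.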

josenimo commented 5 months ago

loaded the same samples locally (they were originally in a server) and it works ...

melonora commented 5 months ago

Did you try to read them from a server directly? Because in that case the byte stream is using a different codec than standard utf-8

josenimo commented 5 months ago

yes, was tried directly, is there a way around that, or would the plugin have to be run locally?

melonora commented 5 months ago

I think there is a way, but it is a bit more advanced hehe. Can you put a breakpoint just before the pd.read_csv() call to see what the path looks like?

melonora commented 5 months ago

also images etc are read differently from a server so that might be a next thing.

melonora commented 5 months ago

was it intended to directly read from a server?

josenimo commented 5 months ago

well it would be nice, but we can deal with it :)

josenimo commented 5 months ago

will look into debugging mode next week

melonora commented 5 months ago

I would say leave it for later; dealing with large data over a server requires a completely different io. Because you are sending a bytestream over a connection, it is not that nice if the data is not sent in chunks.

sfritzs commented 5 months ago

Hey @melonora, here with @josenimo, we tried loading files locally in our workstation (Win10), and it still gives an error.

Error:


(napari-env) C:\Users\coscia-lab>napari
INFO: Loaded 36 regionprops csvs.
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\cell_gater\widgets\sample_widget.py:135, in SampleWidget._open_sample_dialog(self=<cell_gater.widgets.sample_widget.SampleWidget object>)
    132 folder = self._dir_dialog()
    134 if folder not in {"", None}:
--> 135     self._assign_regionprops_to_model(folder)
        folder = 'C:/Users/coscia-lab/Desktop/cylinter/csv'
        self = <cell_gater.widgets.sample_widget.SampleWidget object at 0x0000021033C29F30>
    136     update_open_history(folder)

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\cell_gater\widgets\sample_widget.py:171, in SampleWidget._assign_regionprops_to_model(self=<cell_gater.widgets.sample_widget.SampleWidget object>, folder='C:/Users/coscia-lab/Desktop/cylinter/csv')
    169 def _assign_regionprops_to_model(self, folder: str) -> None:
    170     """Read the csv files in the directory and assign the resulting concatenated DataFrame in the DataModel."""
--> 171     self.model.regionprops_df = stack_csv_files(Path(folder))
        self = <cell_gater.widgets.sample_widget.SampleWidget object at 0x0000021033C29F30>
        folder = 'C:/Users/coscia-lab/Desktop/cylinter/csv'

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\cell_gater\utils\csv_df.py:30, in stack_csv_files(csv_dir=WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv'))
     28 df = pd.DataFrame()
     29 for file in csv_files:
---> 30     df_file = pd.read_csv(file)
        file = WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv')
        pd = <module 'pandas' from 'C:\\Users\\coscia-lab\\AppData\\Local\\miniforge3\\envs\\napari-env\\lib\\site-packages\\pandas\\__init__.py'>
     31     df_file["sample_id"] = file.stem
     32     df = pd.concat([df, df_file], ignore_index=True)

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\pandas\io\parsers\readers.py:1026, in read_csv(filepath_or_buffer=WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv'), sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, usecols=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=<no_default>, keep_date_col=False, date_parser=<no_default>, date_format=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, on_bad_lines='error', delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None, dtype_backend=<no_default>)
   1013 kwds_defaults = _refine_defaults_read(
   1014     dialect,
   1015     delimiter,
   (...)
   1022     dtype_backend=dtype_backend,
   1023 )
   1024 kwds.update(kwds_defaults)
-> 1026 return _read(filepath_or_buffer, kwds)
        kwds = {'delimiter': ',', 'header': 'infer', 'names': None, 'index_col': None, 'usecols': None, 'dtype': None, 'engine': 'c', 'converters': None, 'true_values': None, 'false_values': None, 'skipinitialspace': False, 'skiprows': None, 'skipfooter': 0, 'nrows': None, 'na_values': None, 'keep_default_na': True, 'na_filter': True, 'verbose': False, 'skip_blank_lines': True, 'parse_dates': False, 'infer_datetime_format': <no_default>, 'keep_date_col': False, 'date_parser': <no_default>, 'date_format': None, 'dayfirst': False, 'cache_dates': True, 'iterator': False, 'chunksize': None, 'compression': 'infer', 'thousands': None, 'decimal': '.', 'lineterminator': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'escapechar': None, 'comment': None, 'encoding': None, 'encoding_errors': 'strict', 'dialect': None, 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'delim_whitespace': False, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'dtype_backend': <no_default>, 'engine_specified': False}
        filepath_or_buffer = WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv')

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\pandas\io\parsers\readers.py:620, in _read(filepath_or_buffer=WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv'), kwds={'cache_dates': True, 'chunksize': None, 'comment': None, 'compression': 'infer', 'converters': None, 'date_format': None, 'date_parser': <no_default>, 'dayfirst': False, 'decimal': '.', 'delim_whitespace': False, ...})
    617 _validate_names(kwds.get("names", None))
    619 # Create the parser.
--> 620 parser = TextFileReader(filepath_or_buffer, **kwds)
        TextFileReader = <class 'pandas.io.parsers.readers.TextFileReader'>
        filepath_or_buffer = WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv')
        kwds = {'delimiter': ',', 'header': 'infer', 'names': None, 'index_col': None, 'usecols': None, 'dtype': None, 'engine': 'c', 'converters': None, 'true_values': None, 'false_values': None, 'skipinitialspace': False, 'skiprows': None, 'skipfooter': 0, 'nrows': None, 'na_values': None, 'keep_default_na': True, 'na_filter': True, 'verbose': False, 'skip_blank_lines': True, 'parse_dates': False, 'infer_datetime_format': <no_default>, 'keep_date_col': False, 'date_parser': <no_default>, 'date_format': None, 'dayfirst': False, 'cache_dates': True, 'iterator': False, 'chunksize': None, 'compression': 'infer', 'thousands': None, 'decimal': '.', 'lineterminator': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'escapechar': None, 'comment': None, 'encoding': None, 'encoding_errors': 'strict', 'dialect': None, 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'delim_whitespace': False, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'dtype_backend': <no_default>, 'engine_specified': False}
    622 if chunksize or iterator:
    623     return parser

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\pandas\io\parsers\readers.py:1620, in TextFileReader.__init__(self=<pandas.io.parsers.readers.TextFileReader object>, f=WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv'), engine='c', **kwds={'cache_dates': True, 'chunksize': None, 'comment': None, 'compression': 'infer', 'converters': None, 'date_format': None, 'date_parser': <no_default>, 'dayfirst': False, 'decimal': '.', 'delim_whitespace': False, ...})
   1617     self.options["has_index_names"] = kwds["has_index_names"]
   1619 self.handles: IOHandles | None = None
-> 1620 self._engine = self._make_engine(f, self.engine)
        self.engine = 'c'
        self = <pandas.io.parsers.readers.TextFileReader object at 0x0000021042F018A0>
        f = WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv')

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\pandas\io\parsers\readers.py:1898, in TextFileReader._make_engine(self=<pandas.io.parsers.readers.TextFileReader object>, f=<_io.TextIOWrapper name='C:\\Users\\coscia-lab\\...ylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>, engine='c')
   1895     raise ValueError(msg)
   1897 try:
-> 1898     return mapping[engine](f, **self.options)
        mapping = {'c': <class 'pandas.io.parsers.c_parser_wrapper.CParserWrapper'>, 'python': <class 'pandas.io.parsers.python_parser.PythonParser'>, 'pyarrow': <class 'pandas.io.parsers.arrow_parser_wrapper.ArrowParserWrapper'>, 'python-fwf': <class 'pandas.io.parsers.python_parser.FixedWidthFieldParser'>}
        engine = 'c'
        f = <_io.TextIOWrapper name='C:\\Users\\coscia-lab\\Desktop\\cylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>
        self = <pandas.io.parsers.readers.TextFileReader object at 0x0000021042F018A0>
        mapping[engine] = <class 'pandas.io.parsers.c_parser_wrapper.CParserWrapper'>
        self.options = {'delimiter': ',', 'escapechar': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'skipinitialspace': False, 'lineterminator': None, 'header': 0, 'index_col': None, 'names': None, 'skiprows': set(), 'na_values': {'1.#IND', '', '-1.#IND', '#N/A', '-NaN', 'NaN', '-nan', 'n/a', 'None', 'NULL', '#NA', 'N/A', '-1.#QNAN', '<NA>', '#N/A N/A', 'null', 'nan', '1.#QNAN', 'NA'}, 'keep_default_na': True, 'true_values': None, 'false_values': None, 'converters': {}, 'dtype': None, 'cache_dates': True, 'thousands': None, 'comment': None, 'decimal': '.', 'parse_dates': False, 'keep_date_col': False, 'dayfirst': False, 'date_parser': <no_default>, 'date_format': None, 'usecols': None, 'verbose': False, 'encoding': None, 'compression': 'infer', 'skip_blank_lines': True, 'encoding_errors': 'strict', 'on_bad_lines': <BadLineHandleMethod.ERROR: 0>, 'dtype_backend': <no_default>, 'delim_whitespace': False, 'na_filter': True, 'low_memory': True, 'memory_map': False, 'float_precision': None, 'storage_options': None, 'na_fvalues': set()}
   1899 except Exception:
   1900     if self.handles is not None:

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py:93, in CParserWrapper.__init__(self=<pandas.io.parsers.c_parser_wrapper.CParserWrapper object>, src=<_io.TextIOWrapper name='C:\\Users\\coscia-lab\\...ylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>, **kwds={'allow_leading_cols': True, 'comment': None, 'converters': {}, 'decimal': '.', 'delim_whitespace': False, 'delimiter': ',', 'doublequote': True, 'dtype': None, 'dtype_backend': 'numpy', 'encoding_errors': 'strict', ...})
     90 if kwds["dtype_backend"] == "pyarrow":
     91     # Fail here loudly instead of in cython after reading
     92     import_optional_dependency("pyarrow")
---> 93 self._reader = parsers.TextReader(src, **kwds)
        kwds = {'delimiter': ',', 'escapechar': None, 'quotechar': '"', 'quoting': 0, 'doublequote': True, 'skipinitialspace': False, 'lineterminator': None, 'header': 0, 'index_col': None, 'names': None, 'skiprows': set(), 'na_values': {'1.#IND', '', '-1.#IND', '#N/A', '-NaN', 'NaN', '-nan', 'n/a', 'None', 'NULL', '#NA', 'N/A', '-1.#QNAN', '<NA>', '#N/A N/A', 'null', 'nan', '1.#QNAN', 'NA'}, 'keep_default_na': True, 'true_values': None, 'false_values': None, 'converters': {}, 'dtype': None, 'thousands': None, 'comment': None, 'decimal': '.', 'usecols': None, 'verbose': False, 'skip_blank_lines': True, 'encoding_errors': 'strict', 'on_bad_lines': 0, 'dtype_backend': 'numpy', 'delim_whitespace': False, 'na_filter': True, 'float_precision': None, 'na_fvalues': set(), 'allow_leading_cols': True}
        self = <pandas.io.parsers.c_parser_wrapper.CParserWrapper object at 0x0000021042F017E0>
        src = <_io.TextIOWrapper name='C:\\Users\\coscia-lab\\Desktop\\cylinter\\csv\\._1.csv' mode='r' encoding='utf-8'>
        parsers = <module 'pandas._libs.parsers' from 'C:\\Users\\coscia-lab\\AppData\\Local\\miniforge3\\envs\\napari-env\\lib\\site-packages\\pandas\\_libs\\parsers.cp310-win_amd64.pyd'>
     95 self.unnamed_cols = self._reader.unnamed_cols
     97 # error: Cannot determine type of 'names'

File parsers.pyx:574, in pandas._libs.parsers.TextReader.__cinit__()

File parsers.pyx:663, in pandas._libs.parsers.TextReader._get_header()

File parsers.pyx:874, in pandas._libs.parsers.TextReader._tokenize_rows()

File parsers.pyx:891, in pandas._libs.parsers.TextReader._check_tokenize_status()

File parsers.pyx:2053, in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
melonora commented 5 months ago

hmm are there some weird characters in the csv?

josenimo commented 5 months ago

I think from these lines:

File ~\AppData\Local\miniforge3\envs\napari-env\lib\site-packages\cell_gater\utils\csv_df.py:30, in stack_csv_files(csv_dir=WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv'))
     28 df = pd.DataFrame()
     29 for file in csv_files:
---> 30     df_file = pd.read_csv(file)
        file = WindowsPath('C:/Users/coscia-lab/Desktop/cylinter/csv/._1.csv')

that these files starting with periods are not supposed to be there, and since they have the .csv ending they are being read in.
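
One possible way to guard against this, sketched here with the standard library only: collect the CSV paths first and drop any whose name starts with a period before handing them to pd.read_csv. The function name `visible_csv_files` is hypothetical, and this is not necessarily how the plugin itself handles it.

```python
from pathlib import Path
from typing import List


def visible_csv_files(csv_dir: Path) -> List[Path]:
    """List the .csv files in csv_dir, skipping hidden or sidecar files
    whose names start with a period (for example '._1.csv')."""
    return sorted(
        path
        for path in csv_dir.glob("*.csv")
        if path.is_file() and not path.name.startswith(".")
    )
```

With a folder containing 1.csv and ._1.csv, only 1.csv would be returned, so the loop in stack_csv_files would never see the dot-prefixed file.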

melonora commented 5 months ago

yeah lol, a "." at the start of a file name like that is invalid.

sfritzs commented 5 months ago

file in question has this:  Mac OS X  2°âATTR€À5âÀ.Àcom.apple.lastuseddate#PSÐcom.apple.quarantine«²³eéÊž90082;65b3b2ac;Microsoft Excel;This resource fork intentionally left blank ÿÿ
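
The "Mac OS X" text and the com.apple attribute names suggest this is an AppleDouble sidecar file, which macOS writes as "._<name>" next to the real file when the volume cannot store extended attributes. Such files begin with the four magic bytes 00 05 16 07. A minimal sketch for checking that, assuming the hypothetical helper name `is_appledouble`:

```python
from pathlib import Path

# Magic number at the start of an AppleDouble file.
APPLEDOUBLE_MAGIC = bytes.fromhex("00051607")


def is_appledouble(path: Path) -> bool:
    """Return True if the file starts with the AppleDouble magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == APPLEDOUBLE_MAGIC
```

This only inspects the first four bytes, so it is cheap to run over a whole folder before loading anything.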

melonora commented 5 months ago

did you open these csvs before in excel?

melonora commented 5 months ago

because saving them again from Excel creates a whole lot of encoding that can't be read with the standard settings.

sfritzs commented 5 months ago

nope, we did not have any idea this file existed

melonora commented 5 months ago

https://superuser.com/questions/28384/what-should-i-do-about-com-apple-quarantine

sfritzs commented 5 months ago

trying to delete file and run plugin to test

sfritzs commented 5 months ago

ok, it works, sorry for bringing this up

melonora commented 5 months ago

hehe no worries

josenimo commented 5 months ago

Just to close the issue, fixed in https://github.com/melonora/napari-cell-gater/pull/17/commits/8eb7b221241f614846dad45d2af83b3a2fe7704b