nloyfer / wgbs_tools

tools for working with Bisulfite Sequencing data while preserving reads intrinsic dependencies

Find markers error #51

Closed gibberwocky closed 10 months ago

gibberwocky commented 10 months ago

I'm running into an error with find_markers that looks like it comes from trying to add an integer to a string:

`
Number of markers found: 15
Traceback (most recent call last):
  File "./wgbstools", line 97, in <module>
    main()
  File "./wgbstools", line 64, in main
    importlib.import_module(args.command).main()
  File "./wgbs_tools/src/python/find_markers.py", line 434, in main
    MarkerFinder(params).run()
  File "./wgbs_tools/src/python/find_markers.py", line 114, in run
    self.dump_results(self.res[target].reset_index(drop=True))
  File "./wgbs_tools/src/python/find_markers.py", line 378, in dump_results
    tf['region'] = bed2reg(tf)
  File "./wgbs_tools/src/python/utils_wgbs.py", line 456, in bed2reg
    return df['chr'] + ':' + df['start'].astype(str) + '-' + df['end'].astype(str)
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/ops/common.py", line 72, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/arraylike.py", line 102, in __add__
    return self._arith_method(other, operator.add)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/series.py", line 6259, in _arith_method
    return base.IndexOpsMixin._arith_method(self, other, op)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/base.py", line 1325, in _arith_method
    result = ops.arithmetic_op(lvalues, rvalues, op)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/ops/array_ops.py", line 226, in arithmetic_op
    res_values = _na_arithmetic_op(left, right, op)  # type: ignore[arg-type]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/myconda/envs/meth/lib/python3.11/site-packages/pandas/core/ops/array_ops.py", line 165, in _na_arithmetic_op
    result = func(left, right)
             ^^^^^^^^^^^^^^^^^
numpy.core._exceptions._UFuncNoLoopError: ufunc 'add' did not contain a loop with signature matching types (dtype('int64'), dtype('<U1')) -> None
`

Any thoughts?
gibberwocky commented 10 months ago

So, it looks like this is caused by my chromosome names being numeric, i.e. '1' rather than 'chr1', so pandas reads the 'chr' column as integers. Casting df['chr'] to a string in bed2reg (utils_wgbs.py) appears to work:

    def bed2reg(df):
        if not set(COORDS_COLS3).issubset(set(df.columns)):
            raise IllegalArgumentError('[wt] missing coordinate columns in bed file')
        return df['chr'].astype(str) + ':' + df['start'].astype(str) + '-' + df['end'].astype(str)
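For anyone hitting the same traceback, here is a minimal sketch that reproduces the failure and shows the cast working. It assumes only pandas. The toy DataFrame is made up for illustration, and this `bed2reg` is a simplified stand-in that leaves out the column check done in utils_wgbs.py.

```python
import pandas as pd


def bed2reg(df):
    """Build 'chr:start-end' strings; simplified stand-in for the wgbstools helper."""
    # Casting 'chr' to str makes this work for numeric names such as 1, 2, ...
    return (df['chr'].astype(str) + ':'
            + df['start'].astype(str) + '-'
            + df['end'].astype(str))


# Hypothetical input: chromosomes named 1 and 2, so the column dtype is int64.
df = pd.DataFrame({'chr': [1, 2], 'start': [100, 200], 'end': [150, 250]})

try:
    # Without the cast, this is int64 + str, which numpy cannot add.
    df['chr'] + ':'
    failed = False
except TypeError:
    failed = True

regions = bed2reg(df).tolist()
print(failed, regions)
```

With 'chr1'-style names the column already holds strings, so the extra `astype(str)` should leave that case unaffected.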

nloyfer commented 10 months ago

Fixed it as you suggested. Thank you.