open2c / cooltools

The tools for your .cool's
MIT License

fetch genomic feature issue when trying dots & focal enrichment #495

Closed yingsun-ucsd closed 9 months ago

yingsun-ucsd commented 9 months ago

I am new to cooltools and am trying dots & focal enrichment by following the example here exactly. At step [3], which just uses bioframe to fetch the genomic features from UCSC, I got the following error message. I believe it comes from hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens). Any suggestions? Thanks.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_3747972/3688799565.py in <module>
      4 hg38_chromsizes = bioframe.fetch_chromsizes('hg38')
      5 hg38_cens = bioframe.fetch_centromeres('hg38')
----> 6 hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)

~/.local/lib/python3.8/site-packages/bioframe/extras.py in make_chromarms(chromsizes, midpoints, cols_chroms, cols_mids, suffixes)
    112             "chromosome split into more than two arms, double-check midpoints"
    113         )
--> 114     df_chromarms["name"] = df_chromarms[ck1] + [
    115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]

~/.local/lib/python3.8/site-packages/bioframe/extras.py in <listcomp>(.0)
    113         )
    114     df_chromarms["name"] = df_chromarms[ck1] + [
--> 115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]
    117     # df_chromarms.drop(columns=columns_to_drop, inplace=True)

TypeError: tuple indices must be integers or slices, not numpy.float64
sergpolly commented 9 months ago

Hi @yingsun-ucsd! Yes, this was an issue in bioframe at some point, but it has been fixed recently; see the related discussion here: https://github.com/open2c/bioframe/issues/175

So I think if you update your bioframe, you should be able to get past that error.
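
For anyone landing here from the same traceback, here is a minimal sketch of what that TypeError means. The suffix values and the float index are illustrative assumptions, not copied from bioframe, and a plain Python float stands in for numpy.float64, which a tuple rejects as an index in the same way.

```python
# Illustrative stand-ins; not copied from bioframe.
suffixes = ("_p", "_q")
sub_index = 1.0  # an index value that has been upcast to float


def pick_suffix(options, idx):
    """Cast a float-valued index back to int before indexing a tuple."""
    return options[int(idx)]


try:
    suffixes[sub_index]  # tuples reject float indices
    message = ""
except TypeError as err:
    message = str(err)

print(message)
print(pick_suffix(suffixes, sub_index))
```

The cast only illustrates the mechanism; the practical route suggested in this thread is still to run a bioframe version that contains the fix.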

sergpolly commented 9 months ago

Also, as a heads-up, watch out for this one: https://github.com/open2c/bioframe/issues/186. It would likely affect fetching chromsizes and centromeres from UCSC.

yingsun-ucsd commented 9 months ago

Thank you for your reply.

print(bioframe.__version__)
0.6.1

But I still got the same error message. Which version should I look for? Thanks.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_3851668/945254712.py in <module>
      4 hg38_chromsizes = bioframe.fetch_chromsizes('hg38')
      5 hg38_cens = bioframe.fetch_centromeres('hg38')
----> 6 hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)
      7 
      8 hg38_chromsizes.head()

~/.local/lib/python3.8/site-packages/bioframe/extras.py in make_chromarms(chromsizes, midpoints, cols_chroms, cols_mids, suffixes)
    112             "chromosome split into more than two arms, double-check midpoints"
    113         )
--> 114     df_chromarms["name"] = df_chromarms[ck1] + [
    115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]

~/.local/lib/python3.8/site-packages/bioframe/extras.py in <listcomp>(.0)
    113         )
    114     df_chromarms["name"] = df_chromarms[ck1] + [
--> 115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]
    117     # df_chromarms.drop(columns=columns_to_drop, inplace=True)

TypeError: tuple indices must be integers or slices, not numpy.float64
yingsun-ucsd commented 9 months ago

I also tried "hg38_chromsizes = bioframe.fetch_chromsizes('hg38', provider='ucsc')" and got even more errors:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/tmp/ipykernel_3851668/3964869406.py in <module>
      2 
      3 # Use bioframe to fetch the genomic features from the UCSC.
----> 4 hg38_chromsizes = bioframe.fetch_chromsizes('hg38', provider='ucsc')
      5 hg38_cens = bioframe.fetch_centromeres('hg38')
      6 #hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)

~/.local/lib/python3.8/site-packages/bioframe/io/resources.py in fetch_chromsizes(db, provider, as_bed, filter_chroms, chrom_patterns, natsort, **kwargs)
     96             return assembly.chromsizes
     97     elif provider == "ucsc":
---> 98         return UCSCClient(db).fetch_chromsizes(
     99             filter_chroms=filter_chroms,
    100             chrom_patterns=chrom_patterns,

~/.local/lib/python3.8/site-packages/bioframe/io/resources.py in fetch_chromsizes(self, filter_chroms, chrom_patterns, natsort, as_bed, **kwargs)
    262     ) -> Union[pd.Series, pd.DataFrame]:
    263         url = urljoin(self._db_url, f"bigZips/{self._db}.chrom.sizes")
--> 264         return read_chromsizes(
    265             url,
    266             filter_chroms=filter_chroms,

~/.local/lib/python3.8/site-packages/bioframe/io/fileops.py in read_chromsizes(filepath_or, filter_chroms, chrom_patterns, natsort, as_bed, **kwargs)
    127         kwargs.setdefault("compression", "gzip")
    128 
--> 129     chromtable = pd.read_csv(
    130         filepath_or,
    131         sep="\t",

~/.local/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    209                 else:
    210                     kwargs[new_arg_name] = new_arg_value
--> 211             return func(*args, **kwargs)
    212 
    213         return cast(F, wrapper)

~/.local/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    329                     stacklevel=find_stack_level(),
    330                 )
--> 331             return func(*args, **kwargs)
    332 
    333         # error: "Callable[[VarArg(Any), KwArg(Any)], Any]" has no

~/.local/lib/python3.8/site-packages/pandas/io/parsers/readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    948     kwds.update(kwds_defaults)
    949 
--> 950     return _read(filepath_or_buffer, kwds)
    951 
    952 

~/.local/lib/python3.8/site-packages/pandas/io/parsers/readers.py in _read(filepath_or_buffer, kwds)
    603 
    604     # Create the parser.
--> 605     parser = TextFileReader(filepath_or_buffer, **kwds)
    606 
    607     if chunksize or iterator:

~/.local/lib/python3.8/site-packages/pandas/io/parsers/readers.py in __init__(self, f, engine, **kwds)
   1440 
   1441         self.handles: IOHandles | None = None
-> 1442         self._engine = self._make_engine(f, self.engine)
   1443 
   1444     def close(self) -> None:

~/.local/lib/python3.8/site-packages/pandas/io/parsers/readers.py in _make_engine(self, f, engine)
   1733                 if "b" not in mode:
   1734                     mode += "b"
-> 1735             self.handles = get_handle(
   1736                 f,
   1737                 mode,

~/.local/lib/python3.8/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    711 
    712     # open URLs
--> 713     ioargs = _get_filepath_or_buffer(
    714         path_or_buf,
    715         encoding=encoding,

~/.local/lib/python3.8/site-packages/pandas/io/common.py in _get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode, storage_options)
    361         # assuming storage_options is to be interpreted as headers
    362         req_info = urllib.request.Request(filepath_or_buffer, headers=storage_options)
--> 363         with urlopen(req_info) as req:
    364             content_encoding = req.headers.get("Content-Encoding", None)
    365             if content_encoding == "gzip":

~/.local/lib/python3.8/site-packages/pandas/io/common.py in urlopen(*args, **kwargs)
    263     import urllib.request
    264 
--> 265     return urllib.request.urlopen(*args, **kwargs)
    266 
    267 

/usr/lib/python3.8/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):

/usr/lib/python3.8/urllib/request.py in open(self, fullurl, data, timeout)
    529         for processor in self.process_response.get(protocol, []):
    530             meth = getattr(processor, meth_name)
--> 531             response = meth(req, response)
    532 
    533         return response

/usr/lib/python3.8/urllib/request.py in http_response(self, request, response)
    638         # request was successfully received, understood, and accepted.
    639         if not (200 <= code < 300):
--> 640             response = self.parent.error(
    641                 'http', request, response, code, msg, hdrs)
    642 

/usr/lib/python3.8/urllib/request.py in error(self, proto, *args)
    567         if http_err:
    568             args = (dict, 'default', 'http_error_default') + orig_args
--> 569             return self._call_chain(*args)
    570 
    571 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python3.8/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    500         for handler in handlers:
    501             func = getattr(handler, meth_name)
--> 502             result = func(*args)
    503             if result is not None:
    504                 return result

/usr/lib/python3.8/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden
sergpolly commented 9 months ago

The second one is probably related to https://github.com/open2c/bioframe/issues/186

sergpolly commented 9 months ago

As far as the first error is concerned, could you please provide the versions of pandas and numpy you're using there?

We strongly recommend using package management tools (such as conda) to create isolated and more reproducible environments; otherwise such compatibility/versioning issues are very hard to debug and help with. If you continue having these version issues, could you please consider trying out conda and the installation method suggested here: https://github.com/open2c/open2c_examples

yingsun-ucsd commented 9 months ago

print(sys.version)
3.8.10

print(bioframe.__version__)
print(pd.__version__)  # import pandas as pd
print(np.__version__)  # import numpy as np
0.6.1
1.5.3
1.24.4

yingsun-ucsd commented 9 months ago

[image]

This might be a stupid question. I did have this "name" issue. Do you know how I can get rid of it? And do you think it might work then? Thank you so much for your help.

sergpolly commented 9 months ago

I'm afraid pandas might be out of date here. But I'm hesitant to suggest that you update it, because it might create more compatibility/versioning issues with other packages/modules you have installed. It would be best if you could try an isolated conda env just for the open2c tools.

yingsun-ucsd commented 9 months ago

Sure. I will try that and get back to you soon. Thank you so much for your help.

sergpolly commented 9 months ago

Regarding the name: you could try https://stackoverflow.com/questions/29765548/remove-index-name-in-pandas, but I don't think it would help.

hg38_chromsizes.index.name = None
yingsun-ucsd commented 9 months ago

No, it did not.

yingsun-ucsd commented 9 months ago

@sergpolly I followed https://github.com/open2c/open2c_examples and tried contacts_vs_distance.ipynb. However, I got the same error message.

# Use bioframe to fetch the genomic features from the UCSC.
hg38_chromsizes = bioframe.fetch_chromsizes('hg38')
hg38_cens = bioframe.fetch_centromeres('hg38')
# create a view with chromosome arms using chromosome sizes and definition of centromeres
hg38_arms = bioframe.make_chromarms(hg38_chromsizes,  hg38_cens)

# select only those chromosomes available in cooler
hg38_arms = hg38_arms[hg38_arms.chrom.isin(clr.chromnames)].reset_index(drop=True)
hg38_arms

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_3879691/3462156594.py in <module>
      3 hg38_cens = bioframe.fetch_centromeres('hg38')
      4 # create a view with chromosome arms using chromosome sizes and definition of centromeres
----> 5 hg38_arms = bioframe.make_chromarms(hg38_chromsizes,  hg38_cens)
      6 
      7 # select only those chromosomes available in cooler

~/.local/lib/python3.8/site-packages/bioframe/extras.py in make_chromarms(chromsizes, midpoints, cols_chroms, cols_mids, suffixes)
    112             "chromosome split into more than two arms, double-check midpoints"
    113         )
--> 114     df_chromarms["name"] = df_chromarms[ck1] + [
    115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]

~/.local/lib/python3.8/site-packages/bioframe/extras.py in <listcomp>(.0)
    113         )
    114     df_chromarms["name"] = df_chromarms[ck1] + [
--> 115         suffixes[i] for i in df_chromarms["sub_index_"].values
    116     ]
    117     # df_chromarms.drop(columns=columns_to_drop, inplace=True)

TypeError: tuple indices must be integers or slices, not numpy.float64
sergpolly commented 9 months ago

Looking at this ~/.local/lib/python3.8/site-packages/biofram path, it looks like you're still running your old "environment" with Python/cooltools/bioframe.

The idea was to switch to the Python environment provided by conda.

We recently went through the trial and error of a conda installation with another user in https://github.com/open2c/cooltools/issues/494; could you please look for clues there as well?

yingsun-ucsd commented 9 months ago

I did activate conda:

(open2c-new) ysun@ophelia:/nfs/lab/ysun/4DN$ /home/ysun/anaconda3/bin/jupyter notebook --no-browser --port 3185

sergpolly commented 9 months ago

ok , then while in that notebook - can you check

import bioframe
bioframe.__path__
bioframe.__version__

import cooltools
cooltools.__path__
cooltools.__version__

import pandas as pd
pd.__version__
pd.__path__

The __path__ part is supposed to tell where the package is actually coming from.
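
The same check can be wrapped in a small loop so that a missing package doesn't stop the cell. This is only a sketch: describe is a hypothetical helper name, and the package list is simply the three packages discussed in this thread.

```python
import importlib
import sys


def describe(name):
    """Return (location, version) for a package, or (None, None) if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None, None
    location = getattr(module, "__path__", None) or getattr(module, "__file__", None)
    version = getattr(module, "__version__", None)
    return location, version


# The interpreter that is actually running this cell or script.
print("interpreter:", sys.executable)
for name in ("bioframe", "cooltools", "pandas"):
    location, version = describe(name)
    print(name, location, version)
```

If the printed interpreter or locations point outside the conda environment, the notebook is not running the environment's Python.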

yingsun-ucsd commented 9 months ago
import bioframe
bioframe.__path__
bioframe.__version__

import cooltools
cooltools.__path__
cooltools.__version__

import pandas as pd
pd.__version__
pd.__path__

['/home/ysun/.local/lib/python3.8/site-packages/bioframe']
0.6.1
['/home/ysun/.local/lib/python3.8/site-packages/cooltools']
0.6.1
['/home/ysun/.local/lib/python3.8/site-packages/pandas']
1.5.3

You are RIGHT! It still used the old one instead of ~/anaconda3/envs/open2c/lib/python3.9/site-packages, even though I activated conda. Do you know how to force the notebook to use the correct one? Thanks.

sergpolly commented 9 months ago

The way you start your notebook seems fine, although I don't think you need to write the whole path to jupyter. What does it say when you try which jupyter (in the unix command line/terminal)? Are you sure you've connected to the new notebook at port 3185 in your browser?

Also could you check versions/locations of the command line commands of cooltools after activating the environment ?

which cooltools
which python

They should be from conda , not from your old .local installation
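
A rough Python equivalent of those which checks, as a sketch only: inside is a hypothetical helper, and the script has to be run with the Python of the activated environment so that sys.prefix refers to that environment.

```python
import os
import shutil
import sys


def inside(path, prefix):
    """True if path is located under prefix (both given as filesystem paths)."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


# Resolve each command the way the shell would, then compare it with the
# prefix of the interpreter running this script.
for tool in ("python", "jupyter", "cooltools"):
    found = shutil.which(tool)
    if found is None:
        print(f"{tool}: not on PATH")
    else:
        print(f"{tool}: {found} (inside current prefix: {inside(found, sys.prefix)})")
```

Any tool reported as outside the current prefix is being picked up from somewhere else on PATH, for example an old .local installation.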

yingsun-ucsd commented 9 months ago

I think cooltools is somehow using the wrong path:

(base) ysun@ophelia:/nfs/lab/ysun/4DN$ conda activate open2c
(open2c) ysun@ophelia:/nfs/lab/ysun/4DN$ which jupyter
/home/ysun/anaconda3/envs/open2c/bin/jupyter
(open2c) ysun@ophelia:/nfs/lab/ysun/4DN$ which cooltools
/home/ysun/.local/bin/cooltools
(open2c) ysun@ophelia:/nfs/lab/ysun/4DN$ which python
/home/ysun/anaconda3/envs/open2c/bin/python

sergpolly commented 9 months ago

Hm, indeed! I'm wondering whether cooltools actually got installed there at all.

Were there any errors during installation?

To make sure - could you check if cooltools is in the environment folder: /home/ysun/anaconda3/envs/open2c/bin ?

sergpolly commented 9 months ago

the conda environment installation itself should be as easy as:

conda env create -f environment.yml

where environment.yml is the file that describes what needs to be installed - the one that is provided by open2c_examples - you can download that file even without cloning the open2c_examples repo - e.g.

# download environment.yml in your current folder
wget https://raw.githubusercontent.com/open2c/open2c_examples/master/environment.yml

You can modify that file itself, e.g. change the name of the environment or remove some of the programs (such as pysam and pairtools, in case you're going to work on binned data only) to make the environment lighter.

yingsun-ucsd commented 9 months ago

/home/ysun/anaconda3/envs/open2c/bin

Seems it's here:

(open2c) ysun@ophelia:/nfs/lab/ysun/4DN$ ls /home/ysun/anaconda3/envs/open2c/bin/cool*
/home/ysun/anaconda3/envs/open2c/bin/cooler
/home/ysun/anaconda3/envs/open2c/bin/cooltools

The installation seems fine too. What do you think?

(open2c-new) ysun@ophelia:/nfs/lab/ysun/4DN$ pip install cooltools
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: cooltools in /home/ysun/.local/lib/python3.8/site-packages (0.6.1)
Requirement already satisfied: bioframe>=0.4.1 in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (0.6.1)
Requirement already satisfied: click>=7 in /usr/lib/python3/dist-packages (from cooltools) (7.0)
Requirement already satisfied: cooler>=0.9.1 in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (0.9.3)
Requirement already satisfied: cython in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (3.0.8)
Requirement already satisfied: joblib in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (1.3.2)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.8/dist-packages (from cooltools) (3.4.3)
Requirement already satisfied: multiprocess in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (0.70.16)
Requirement already satisfied: numba in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (0.58.1)
Requirement already satisfied: numpy in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (1.24.4)
Requirement already satisfied: pandas<2,>=1.5.1 in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (1.5.3)
Requirement already satisfied: scikit-learn>=1.1.2 in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (1.3.2)
Requirement already satisfied: scipy in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (1.9.1)
Requirement already satisfied: scikit-image in /home/ysun/.local/lib/python3.8/site-packages (from cooltools) (0.20.0)
Requirement already satisfied: pyyaml in /home/ysun/.local/lib/python3.8/site-packages (from bioframe>=0.4.1->cooltools) (5.4.1)
Requirement already satisfied: requests in /home/ysun/.local/lib/python3.8/site-packages (from bioframe>=0.4.1->cooltools) (2.28.2)
Requirement already satisfied: typing-extensions in /home/ysun/.local/lib/python3.8/site-packages (from bioframe>=0.4.1->cooltools) (4.5.0)
Requirement already satisfied: asciitree in /home/ysun/.local/lib/python3.8/site-packages (from cooler>=0.9.1->cooltools) (0.3.3)
Requirement already satisfied: cytoolz in /home/ysun/.local/lib/python3.8/site-packages (from cooler>=0.9.1->cooltools) (0.12.3)
Requirement already satisfied: h5py>=2.5 in /usr/local/lib/python3.8/dist-packages (from cooler>=0.9.1->cooltools) (3.4.0)
Requirement already satisfied: pyfaidx in /home/ysun/.local/lib/python3.8/site-packages (from cooler>=0.9.1->cooltools) (0.8.1.1)
Requirement already satisfied: simplejson in /usr/lib/python3/dist-packages (from cooler>=0.9.1->cooltools) (3.16.0)
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.8/dist-packages (from pandas<2,>=1.5.1->cooltools) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.8/dist-packages (from pandas<2,>=1.5.1->cooltools) (2021.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.8/dist-packages (from scikit-learn>=1.1.2->cooltools) (2.2.0)
Requirement already satisfied: cycler>=0.10 in /home/ysun/.local/lib/python3.8/site-packages (from matplotlib->cooltools) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from matplotlib->cooltools) (1.3.2)
Requirement already satisfied: pillow>=6.2.0 in /home/ysun/.local/lib/python3.8/site-packages (from matplotlib->cooltools) (9.4.0)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.8/dist-packages (from matplotlib->cooltools) (2.4.7)
Requirement already satisfied: dill>=0.3.8 in /home/ysun/.local/lib/python3.8/site-packages (from multiprocess->cooltools) (0.3.8)
Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in /home/ysun/.local/lib/python3.8/site-packages (from numba->cooltools) (0.41.1)
Requirement already satisfied: importlib-metadata in /home/ysun/.local/lib/python3.8/site-packages (from numba->cooltools) (6.8.0)
Requirement already satisfied: networkx>=2.8 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (3.0)
Requirement already satisfied: imageio>=2.4.1 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (2.26.0)
Requirement already satisfied: tifffile>=2019.7.26 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (2023.2.28)
Requirement already satisfied: PyWavelets>=1.1.1 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (1.4.1)
Requirement already satisfied: packaging>=20.0 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (23.0)
Requirement already satisfied: lazy_loader>=0.1 in /home/ysun/.local/lib/python3.8/site-packages (from scikit-image->cooltools) (0.1)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.1->pandas<2,>=1.5.1->cooltools) (1.14.0)
Requirement already satisfied: toolz>=0.8.0 in /home/ysun/.local/lib/python3.8/site-packages (from cytoolz->cooler>=0.9.1->cooltools) (0.12.0)
Requirement already satisfied: zipp>=0.5 in /usr/lib/python3/dist-packages (from importlib-metadata->numba->cooltools) (1.0.0)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from pyfaidx->cooler>=0.9.1->cooltools) (45.2.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/ysun/.local/lib/python3.8/site-packages (from requests->bioframe>=0.4.1->cooltools) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests->bioframe>=0.4.1->cooltools) (2.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ysun/.local/lib/python3.8/site-packages (from requests->bioframe>=0.4.1->cooltools) (1.26.14)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests->bioframe>=0.4.1->cooltools) (2019.11.28)
DEPRECATION: distro-info 0.23ubuntu1 has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of distro-info or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
DEPRECATION: python-debian 0.1.36ubuntu1 has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of python-debian or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
(open2c-new) ysun@ophelia:/nfs/lab/ysun/4DN$

yingsun-ucsd commented 9 months ago

I will try it again and keep you posted on how it goes. Thanks again for all your help.

sergpolly commented 9 months ago

here - in the very beginning of the installation :

(open2c-new) ysun@ophelia:/nfs/lab/ysun/4DN$ pip install cooltools
Defaulting to user installation because normal site-packages is not writeable

it says it is defaulting to your old user installation because something is not writable. It is hard to tell why, or what that is, without more context.

But the point is that you shouldn't be running pip install cooltools at all: it should be installed as part of the conda environment. Check the content of that environment.yml file; cooltools is already there.

Also, make sure to check which pip it was trying to use. It is supposed to be the pip from conda, not your .local one.
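
One way to see which pip belongs to a given Python is to call pip through that interpreter rather than through whatever pip comes first on PATH. A minimal sketch, where pip_command is a hypothetical helper name:

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


# check=False: report the result instead of raising if pip is unavailable.
result = subprocess.run(
    pip_command("--version"), capture_output=True, text=True, check=False
)
print(result.stdout.strip() or result.stderr.strip())
```

pip --version prints the directory pip was loaded from and the Python version it is tied to, so a path under .local or a python 3.8 tag would indicate the old installation.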

yingsun-ucsd commented 9 months ago

Thank you so much! I ran "pip uninstall cooltools" and then "conda env create -f environment.yml". Everything seems OK. And I checked:

(open2c-local) ysun@ophelia:~/.local/bin$ which cooltools
/home/ysun/anaconda3/envs/open2c-local/bin/cooltools
(open2c-local) ysun@ophelia:~/.local/bin$ which jupyter
/home/ysun/anaconda3/envs/open2c-local/bin/jupyter
(open2c-local) ysun@ophelia:~/.local/bin$ which python
/home/ysun/anaconda3/envs/open2c-local/bin/python

But in the notebook:

ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_217973/2640650816.py in <module>
      7 import cooler
      8 import bioframe
----> 9 import cooltools
     10 from cooltools.lib.numutils import fill_diag
     11 from packaging import version

ModuleNotFoundError: No module named 'cooltools'

Do you know how I could force the path to cooltools? Thanks.

sergpolly commented 9 months ago

Could you please confirm the __path__ of cooler and bioframe while in the notebook?

My suspicion is that you're still importing your old .local Python packages.

yingsun-ucsd commented 9 months ago

import bioframe
print(bioframe.__path__)
print(bioframe.__version__)

import cooler
print(cooler.__path__)
print(cooler.__version__)

['/home/ysun/.local/lib/python3.8/site-packages/bioframe']
0.6.1
['/home/ysun/.local/lib/python3.8/site-packages/cooler']
0.9.3

I think it keeps reading from the wrong folder, but I don't know why. 😓

sergpolly commented 9 months ago

I know, this is a bit puzzling and hard to debug remotely. Could you run ! conda env list inside the notebook? It is a shell command, which you can run directly in the notebook by adding ! in front.
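
As a complement, the kernel's own interpreter can be compared with the environment that the launching shell had activated. This is a sketch under one assumption: that CONDA_PREFIX, which conda activate sets in the shell, is inherited by the notebook kernel, so the variable can name one environment while the kernel runs a different Python. runs_from is a hypothetical helper name.

```python
import os
import sys
from pathlib import Path


def runs_from(executable, env_prefix):
    """True if the interpreter at `executable` lives under `env_prefix`."""
    return Path(env_prefix) in Path(executable).parents


# Set by `conda activate` in the shell that launched the notebook server.
conda_prefix = os.environ.get("CONDA_PREFIX")
print("kernel interpreter:", sys.executable)
print("CONDA_PREFIX      :", conda_prefix)
if conda_prefix:
    print("kernel runs from that env:", runs_from(sys.executable, conda_prefix))
```

If the last line prints False, the shell has the environment activated but the kernel is started with another Python, which would point at the Jupyter kernel configuration rather than at the environment itself.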

yingsun-ucsd commented 9 months ago

! conda env list

# conda environments:
#
jupyter                  /home/ysun/.conda/envs/jupyter
reticulate               /home/ysun/.conda/envs/reticulate
                         /home/ysun/.local/share/r-miniconda
                         /home/ysun/.local/share/r-miniconda/envs/r-reticulate
base                     /home/ysun/anaconda3
call_peaks               /home/ysun/anaconda3/envs/call_peaks
open2c                   /home/ysun/anaconda3/envs/open2c
open2c-local          *  /home/ysun/anaconda3/envs/open2c-local
open2c-new               /home/ysun/anaconda3/envs/open2c-new
reticulate-new           /home/ysun/anaconda3/envs/reticulate-new
                         /home/ysun/miniconda3
                         /home/ysun/miniconda3/envs/conda_clone_from_parul
                         /home/ysun/miniconda3/envs/encode-atac-seq-pipeline
                         /home/ysun/miniconda3/envs/encode-atac-seq-pipeline-macs2
                         /home/ysun/miniconda3/envs/encode-atac-seq-pipeline-python2
                         /home/ysun/miniconda3/envs/encode-atac-seq-pipeline-spp
                         /home/ysun/miniconda3/envs/encode-chip-seq-pipeline
                         /home/ysun/miniconda3/envs/encode-chip-seq-pipeline-macs2
                         /home/ysun/miniconda3/envs/encode-chip-seq-pipeline-spp
                         /home/ysun/miniconda3/envs/merlin_env
                         /home/ysun/miniconda3/envs/scanpy_env
sergpolly commented 9 months ago

could you try to follow some of what they discuss here https://github.com/jupyter/notebook/issues/3311

I'm a bit busy today, and cannot do a hands on debugging - sorry

yingsun-ucsd commented 9 months ago

Understand. Thank you so much!

sergpolly commented 9 months ago

I had one additional thought: we want to narrow the problem down. Is jupyter to blame, or is it more than that?

To check this, see where cooler/bioframe/cooltools are imported from when you use the python interpreter directly, right in the terminal on your cluster/computer. First make sure it is the right python with which python, then start python and, in that interactive session, import each package and check its __path__. It is not as convenient as jupyter, but it is important to check. After that, check whether ipython is installed in the environment (it is an interactive python interpreter that is related to jupyter but runs 100% in the terminal) and do the same thing there: which ipython, then test the imports and __path__. If they are all coming from conda, then we know that there is a problem with the jupyter configuration (like they describe in some of those issues).
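
The check described above can be sketched as a short script to paste into both the plain python session and the ipython session. The helper name report_import_locations is made up for illustration; it only uses the standard library, and it assumes the three packages are the ones of interest. If the reported locations sit under the active conda environment in the terminal but under ~/.local in jupyter, that would point to the notebook kernel configuration.

```python
import importlib
import sys


def report_import_locations(names=("cooler", "bioframe", "cooltools")):
    """Return {name: (version, location)}; the value is None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Packages expose __path__ (a list of directories);
        # plain modules only have __file__.
        locations = list(getattr(module, "__path__", []))
        if not locations:
            locations = [getattr(module, "__file__", None)]
        version = getattr(module, "__version__", "unknown")
        report[name] = (version, locations[0])
    return report


# Which interpreter is actually running this code?
print("interpreter:", sys.executable)
for name, info in report_import_locations().items():
    print(name, "->", info if info is not None else "not importable here")
```

Comparing sys.executable and the package locations between the two sessions is usually enough to tell whether the same installation is being picked up in both.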

yingsun-ucsd commented 9 months ago

Sure, I will definitely do that. But I have an urgent task on hand, so for now I am running the python interpreter directly in the terminal. How can I save the result to a file, as with --output?

import pandas as pd
import numpy as np
from itertools import chain

# Hi-C utilities imports:
import cooler
import bioframe
import cooltools
from cooltools.lib.numutils import fill_diag
from packaging import version
if version.parse(cooltools.__version__) < version.parse('0.5.2'):
    raise AssertionError("tutorials rely on cooltools version 0.5.2 or higher, "+
                         "please check your cooltools version and update to the latest")

# Visualization imports:
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
import matplotlib.patches as patches
from matplotlib.ticker import EngFormatter

# helper functions for plotting
bp_formatter = EngFormatter('b')
def format_ticks(ax, x=True, y=True, rotate=True):
    """format ticks with genomic coordinates as human readable"""
    if y:
        ax.yaxis.set_major_formatter(bp_formatter)
    if x:
        ax.xaxis.set_major_formatter(bp_formatter)
        ax.xaxis.tick_bottom()
    if rotate:
        ax.tick_params(axis='x',rotation=45)

# Load the data and define cooler:
dir = '/nfs/lab/ysun/4DN'
project = '4DNESRJ8KV4Q'
sample = '4DNFI7JNCNFB'

cool_file = dir+'/'+project+'/'+sample+'.mcool'
print(cool_file)

# 10 kb is a resolution at which one can clearly see "dots":
binsize = 10_000

# Open cool file with Micro-C data:
clr = cooler.Cooler(f'{cool_file}::/resolutions/{binsize}')

# define genomic view that will be used to call dots and pre-compute expected

# Use bioframe to fetch the genomic features from the UCSC.
hg38_chromsizes = bioframe.fetch_chromsizes('hg38')
hg38_cens = bioframe.fetch_centromeres('hg38')
hg38_arms = bioframe.make_chromarms(hg38_chromsizes, hg38_cens)

# Select only chromosomes that are present in the cooler.
hg38_arms = hg38_arms.set_index("chrom").loc[clr.chromnames].reset_index()

# intra-arm expected
expected = cooltools.expected_cis(
    clr,
    view_df=hg38_arms,
    nproc=4,
)

dots_df = cooltools.dots(
    clr,
    expected=expected,
    view_df=hg38_arms,
    # how far from the main diagonal to call dots:
    max_loci_separation=10_000_000,
    nproc=4,
)

How can I save the result in the BEDPE-like format? The CLI documents this option as: -o, --output (Required) Specify output file name to store called dots in a BEDPE-like format.

nvictus commented 9 months ago

Use the source, Luke :)

yingsun-ucsd commented 9 months ago

@nvictus Got it. Thanks.
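
One way to write the called dots to disk from a python session, assuming dots_df is a pandas DataFrame as in the script above, is a plain tab-separated dump with DataFrame.to_csv. This is a minimal sketch and not necessarily what the CLI does internally: the toy table, its values, and the file name dots.bedpe.tsv are made up so the example runs without a cooler file.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the dots_df returned by cooltools.dots; values are invented.
dots_df = pd.DataFrame(
    {
        "chrom1": ["chr1", "chr2"],
        "start1": [1_000_000, 5_000_000],
        "end1": [1_010_000, 5_010_000],
        "chrom2": ["chr1", "chr2"],
        "start2": [1_500_000, 5_800_000],
        "end2": [1_510_000, 5_810_000],
    }
)

# Hypothetical output location; replace with the path you want.
output_path = os.path.join(tempfile.mkdtemp(), "dots.bedpe.tsv")

# Tab-separated, with a header row and without the DataFrame index.
dots_df.to_csv(output_path, sep="\t", index=False, header=True, na_rep="nan")

# Read the file back to confirm it round-trips.
roundtrip = pd.read_csv(output_path, sep="\t")
print(roundtrip.head())
```

The first six columns (chrom1, start1, end1, chrom2, start2, end2) are what makes the table BEDPE-like; any extra columns are simply written after them.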

gfudenberg commented 8 months ago

This may actually have been due to https://github.com/open2c/bioframe/issues/189, which is now fixed in bioframe v0.6.2: https://github.com/open2c/bioframe/blob/da0d4657c72d57d30c60d4ace8a4942c30600ceb/bioframe/ops.py#L513
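
To see whether the installed bioframe already includes that fix, a quick version check along these lines may help. It is a rough sketch using only the standard library: parse_release and is_at_least are hypothetical helpers, and the parsing only understands plain numeric versions such as 0.6.2 (for pre-releases or unusual version strings, packaging.version, as used in the script above, is the more robust choice).

```python
from importlib import metadata


def parse_release(version_string):
    """Turn '0.6.2' into (0, 6, 2); stops at the first non-numeric piece."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(package, minimum):
    """Return True or False, or None if the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_release(installed) >= parse_release(minimum)


# None here means bioframe is not installed in the interpreter running this.
print("bioframe >= 0.6.2:", is_at_least("bioframe", "0.6.2"))
```

Run it in the same interpreter (or notebook kernel) that raised the error, since a different environment may hold a different bioframe.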

xiasijian commented 1 week ago

conda install --use-local bioframe-0.7.2-pyhdfd78af_0.tar.bz2