omegahh / DeepHiC

A GAN-based method for Enhancing Hi-C data
MIT License
28 stars · 8 forks

Training step #4

Open smurthy55 opened 3 years ago

smurthy55 commented 3 years ago

Hi! I hope you are doing well.

I was following the steps listed in the tutorial for training DeepHiC. At the training step (python training.py), I received the following connection-related error:

WARNING:root:Setting up a new session... Exception in user code:

    Traceback (most recent call last):
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
        raise err
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
        chunked=chunked,
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1244, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1290, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1239, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1026, in _send_output
        self.send(msg)
      File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 966, in send
        self.connect()
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 184, in connect
        conn = self._new_conn()
      File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 169, in _new_conn
        self, "Failed to establish a new connection: %s" % e
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f916229ae50>: Failed to establish a new connection: [Errno 61] Connection refused


Would you know how I could address this issue? Thanks so much for your help.

omegahh commented 3 years ago

From what I can see, these errors are caused by the network connection. In this repository, only one step, loading the VGG16 model from torchvision, needs internet access to download a pre-trained model file. Could you try running the following code on your machine?

    from torchvision.models.vgg import vgg16
    vgg = vgg16(pretrained=True)

smurthy55 commented 3 years ago

Thanks so much! I am still learning how DeepHiC works and how we can use it with our data, and wanted to ask some follow-up questions:

  1. I wanted to ask about the selection of parameters when following these steps for our own data. For example, how do you suggest we determine the downsampling factor we should use based on our data?

  2. Also I notice that the GM12878 data is stored as .npz files by resolution and chromosome. Should we input our own data as .npz files per chromosome, or can the entire dataset at one resolution be stored as one .npz file?

  3. When running the data_generate.py script, one of the parameters is “-s”, which selects the input dataset split. For the GM12878 dataset, the example command uses “all” for the -s parameter. However, this does not work when I run it with that value (“train” and “valid” seem to run to completion). The error I receive with “-s all” is:

    [murthys3@cn0862 DeepHiC]$ python data_generate.py -hr 10kb -lr 40kb -lrc 100 -s all -chunk 40 -stride 40 -bound 201 -scale 1 -c GM12878_primary
    Traceback (most recent call last):
      File "data_generate.py", line 49, in <module>
        chr_list = set_dict[dataset]
    KeyError: 'all'

omegahh commented 3 years ago
  1. I have done some analyses for determining the downsampling factor for users' data; see Note S2 and Fig. S17. I hope these results help you.

  2. DeepHiC only focuses on intra-chromosomal data, so I think they should be separated by chromosomes.

  3. Sorry for this error. The options for -s should be "train/test/human/mouse". The "all" option has been deprecated, and I forgot to update the instructions.
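For point 2, a per-chromosome layout could be written like this (a minimal sketch; the `hic` key name and file-naming pattern are assumptions, so match whatever the scripts in your version of the repository expect):

```python
import os
import tempfile
import numpy as np

# Sketch: one compressed .npz per chromosome, each holding the
# intra-chromosomal contact matrix under an assumed key 'hic'.
outdir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
for chrom in ('chr1', 'chr2'):
    mat = rng.poisson(1.0, size=(200, 200)).astype(np.float32)
    np.savez_compressed(os.path.join(outdir, f'{chrom}_10kb.npz'), hic=mat)

# Each file loads independently, without touching the other chromosomes.
loaded = np.load(os.path.join(outdir, 'chr1_10kb.npz'))['hic']
print(loaded.shape)  # → (200, 200)
```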

smurthy55 commented 3 years ago

Thanks so much for your help! Do you have suggestions on how to convert the prediction npz files to files that can be visualized in a map such as a .hic file?

omegahh commented 3 years ago

I use .npz files because they can be easily loaded in Python (with numpy) without any conversion, and these compressed files save a lot of storage. I use the matplotlib package for Hi-C matrix visualization.
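A minimal plotting sketch along these lines (the file name, color map, and the random stand-in matrix are arbitrary choices, not the repository's defaults):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render without a display
import matplotlib.pyplot as plt

# A random non-negative matrix stands in for the array you would
# read from a prediction .npz with np.load(...).
mat = np.abs(np.random.default_rng(0).standard_normal((100, 100)))

fig, ax = plt.subplots(figsize=(5, 5))
im = ax.imshow(mat, cmap='Reds')  # heatmap view of the contact matrix
fig.colorbar(im, ax=ax)
fig.savefig('hic_heatmap.png', dpi=150)
plt.close(fig)
```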

smurthy55 commented 3 years ago

Thanks so much for all your help. I wanted to ask about using relatively high resolution data for the DeepHiC pipeline. Do you think it still would work for these datasets?

It seems that when running the pipeline, the outputs are enhanced resolution maps of the lower resolution maps generated from the pipeline. Is it possible, if starting with an input of 10kb, to get an enhanced predicted output of 10kb, rather than an enhanced predicted output of the lower resolution data, such as 40kb? Perhaps I am misunderstanding the outputs or am not generating the outputs correctly.

omegahh commented 3 years ago

Yes, you are right. The low-resolution input of the model is actually still binned at 10kb, but it has a lower sequencing depth than the real 10kb Hi-C data.

In short, both the input and output of our model are 10kb-binned matrices, but the input has a lower sequencing depth.
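One way to mimic such an input (a sketch only; the 1/16 ratio here is an arbitrary example, not the paper's value) is to binomially downsample the raw counts while keeping the 10kb binning fixed:

```python
import numpy as np

# Keep the bin size fixed; reduce only the read depth.
rng = np.random.default_rng(0)
full = rng.poisson(8.0, size=(100, 100))  # stand-in for real 10kb counts
low = rng.binomial(full, 1.0 / 16.0)      # same bins, ~1/16 of the reads

print(full.shape == low.shape)  # → True
```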

smurthy55 commented 3 years ago

Thanks for all your help. I have outputs from the DeepHiC method that I have visualized with matplotlib, and I am seeing sharp edges between the blocks in the enhanced plots. Is there a way to smooth or reduce these edges?

I am also currently trying to filter and normalize our data as described in the paper, to fine-tune the data and reduce the effect of outliers. I noticed that you used 255 as the average 99.9th-percentile threshold for your data and set values higher than this to 255. Was this 255 calculated by taking the 99.9th percentile of the Hi-C matrix for each chromosome and then averaging? If so, it seems that for the raw GM12878 npz files generated in the pipeline, the average 99.9th percentile would be much lower (~78.6).
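For reference, the filtering step I am applying looks roughly like this (a sketch of my understanding, with random stand-in matrices, not the repository's exact code):

```python
import numpy as np

# Per-chromosome 99.9th percentiles, averaged into one threshold,
# then used as a ceiling when clipping each matrix.
rng = np.random.default_rng(1)
mats = {f'chr{i}': rng.poisson(2.0, size=(300, 300)) for i in (1, 2, 3)}

threshold = np.mean([np.percentile(m, 99.9) for m in mats.values()])
clipped = {c: np.minimum(m, threshold) for c, m in mats.items()}
```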

omegahh commented 3 years ago

Hello, smurthy55, it has been a long time. For question one, the edges between divided blocks should diminish if training is sufficient. We did observe edges when using a model trained for, say, only 100 epochs. Based on your description, we think we should update this model to avoid the problem. We might also make the prediction on the whole matrix without splitting it into blocks, but that needs more testing, and I cannot say when it will be finished.

For question two, we used the processed Hi-C matrices from GEO. We noticed that the processed files have since been replaced with .hic files, while they were compressed text files before; I am not sure whether they are the same data stored in a different format. The following figure shows the distribution of the 99.x (x = 1, 3, 5, 7, 9) percentiles for each chromosome in the 10kb GM12878 cell line data, as well as in the downsampled data (40kb equivalent, on the right).

[Screenshot 2020-12-17 09:18:25]

Hope this helps you!

Omeiko commented 2 years ago

Hello, I had the same problem, and my network connection was OK. Here is my error:

    Setting up a new session... Exception in user code:

    Traceback (most recent call last):
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
        conn = connection.create_connection(
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/connection.py", line 96, in create_connection
        raise err
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/connection.py", line 86, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
        httplib_response = self._make_request(
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 394, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 239, in request
        super(HTTPConnection, self).request(method, url, body=body, headers=headers)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1252, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1298, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1247, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1007, in _send_output
        self.send(msg)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 947, in send
        self.connect()
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 205, in connect
        conn = self._new_conn()
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 186, in _new_conn
        raise NewConnectionError(
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
        resp = conn.urlopen(
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
        retries = retries.increment(
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
        raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/0119-deephic (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/visdom/__init__.py", line 708, in _send
        return self._handle_post(
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/visdom/__init__.py", line 677, in _handle_post
        r = self.session.post(url, data=data)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 590, in post
        return self.request('POST', url, data=data, json=json, **kwargs)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
        resp = self.send(prep, **send_kwargs)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
        r = adapter.send(request, **kwargs)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
        raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/0119-deephic (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused'))
    [Errno 111] Connection refused
    on_close() takes 1 positional argument but 3 were given
      0%|          | 0/831 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "train.py", line 112, in <module>
        g_loss.backward()
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 256, 1, 1]] is at version 2; expected version 1 instead.
    Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

jinsooahn commented 2 years ago

I have been trying to solve these issues. I solved '[Errno 111] Connection refused' by running the visdom server.

In my case, I ran the following after installing visdom, then navigated to http://localhost:8097 to check that it was working:

    python -m visdom.server &

Regarding the in-place operation error, I got more detail by adding torch.autograd.set_detect_anomaly(True) in train.py; it reported "The variable in question was changed in there or anywhere later". I am not an expert, but after doing some research I moved optimizerD.step() to the location below, and then it worked well. The authors can correct me if I am wrong.

    ######### Train discriminator #########
    netD.zero_grad()
    real_out = netD(real_img)
    fake_out = netD(fake_img)
    d_loss_real = criterionD(real_out, torch.ones_like(real_out))
    d_loss_fake = criterionD(fake_out, torch.zeros_like(fake_out))
    d_loss = d_loss_real + d_loss_fake
    d_loss.backward(retain_graph=True)

    ######### Train generator #########
    netG.zero_grad()
    g_loss = criterionG(fake_out.mean(), fake_img, real_img)
    g_loss.backward()

    # Step both optimizers only after both backward passes, so the
    # discriminator weights used by g_loss are not modified in place
    # before the generator's gradient is computed.
    optimizerD.step()
    optimizerG.step()
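A self-contained toy version of this ordering (hypothetical one-layer networks and a plain BCE generator loss, not the DeepHiC models or criterionG) that can be run to check that no in-place error occurs:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
netG = nn.Linear(4, 4)   # toy stand-in for the generator
netD = nn.Linear(4, 1)   # toy stand-in for the discriminator
optimizerG = torch.optim.SGD(netG.parameters(), lr=0.1)
optimizerD = torch.optim.SGD(netD.parameters(), lr=0.1)
criterion = nn.BCEWithLogitsLoss()

real_img = torch.randn(8, 4)
fake_img = netG(torch.randn(8, 4))

# Discriminator loss on real and generated batches.
netD.zero_grad()
real_out = netD(real_img)
fake_out = netD(fake_img)
d_loss = (criterion(real_out, torch.ones_like(real_out))
          + criterion(fake_out, torch.zeros_like(fake_out)))
d_loss.backward(retain_graph=True)

# Generator loss reuses fake_out, so its backward pass must run
# before optimizerD.step() modifies the discriminator weights.
netG.zero_grad()
g_loss = criterion(fake_out, torch.ones_like(fake_out))
g_loss.backward()

# Both optimizers step only after both backward passes.
optimizerD.step()
optimizerG.step()
```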