wangjr03 / FLAMINGO

`domain_res` problems #4

Closed: shenlinyong closed this issue 2 years ago

shenlinyong commented 2 years ago

Does frag_res = xxx mean that the 3D model is built at xxx kb resolution? And does domain_res ("Size of the domains in bps") mean the resolution of the domains? What should I base its value on? I don't quite understand.

haowang0508 commented 2 years ago

Hi,

frag_res refers to the final resolution of the 3D chromatin structures, e.g. 5000 (5 kb) or 1000 (1 kb). domain_res refers to the resolution of the larger domains in the hierarchical reconstruction procedure and should be larger than frag_res. I would suggest using 1000000 (1 Mb) or 500000 (500 kb). Keeping domain_res / frag_res ~ 100 is usually a good choice.
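
As a rough illustration of the ratio rule above, this base-R sketch derives a domain_res from a chosen frag_res and estimates how many domains a chromosome would be split into. The helper names `suggest_domain_res` and `estimate_n_domains`, the example chromosome length, and the ceiling-based domain count are assumptions made for illustration; they are not part of FLAMINGO.

```r
# Hypothetical helper (not part of FLAMINGOr): derive domain_res from frag_res
# using the ~100x rule of thumb from the reply above.
suggest_domain_res <- function(frag_res, ratio = 100) {
  stopifnot(frag_res > 0, ratio > 1)
  frag_res * ratio
}

# Assumed estimate: a chromosome of chr_size bp cut into bins of domain_res bp.
estimate_n_domains <- function(chr_size, domain_res) {
  ceiling(chr_size / domain_res)
}

frag_res   <- 5e3                            # 5 kb final resolution
domain_res <- suggest_domain_res(frag_res)   # 100 x frag_res
chr_size   <- 196e6                          # example length only, not from the thread
n_domains  <- estimate_n_domains(chr_size, domain_res)

cat("domain_res:", domain_res, "\n")
cat("fragments per domain:", domain_res / frag_res, "\n")
cat("estimated domains:", n_domains, "\n")
```

The estimated domain count is worth keeping in mind when choosing a thread count for the run.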

shenlinyong commented 2 years ago

Hello,

Thank you very much for your answer. However, I get a new error when I run the software, even though 191 Genomic_loc_domain_xxx.txt files have been generated.

The following is the code I ran:

library(FLAMINGOr)
library(GenomicFeatures)
library(Matrix)

getwd()
setwd("/home/SLY68/2022/hic/juicer/down_analysis/raw/hic")

all_size <- read.table("/storage/SLY68/2022/hic/juicer/restriction_sites/GRCg7b_genomic.size")
for(i in 32:72){
  chr_name=as.character(all_size[1,1])
  chr_size = all_size[1,2]
  res = flamingo.main_func_large(hic_data_low='./fat.hic',
                               file_format='hic',
                               domain_res=1e6,frag_res=5e3,
                               chr_size=chr_size,
                               chr_name=chr_name,
                               normalization='KR',
                               downsampling_rates=0.75,
                               lambda=10,max_dist=0.01,nThread=90,n_row=30000)

Here is the log file with the error report:

[1] "/storage/SLY68/2022/hic/juicer/down_analysis/raw/hic"
[1] "Contact map is too large, large matrix mod is on"
[1] "Dividing domains..."
[1] "Processing Fragments..."
[1] "caching datasets..."
x being coerced from class: matrix to data.table
x being coerced from class: matrix to data.table
...
x being coerced from class: matrix to data.table
x being coerced from class: matrix to data.table
[1] "Reconstructing backbones..."
Loading required package: nlme
This is mgcv 1.8-36. For overview type 'help("mgcv-package")'.
[1] "Reconstructing intra-domain structures..."
Error in checkForRemoteErrors(val) :
  one node produced an error: cannot open the connection
Calls: flamingo.main_func_large ... clusterApply -> staticClusterApply -> checkForRemoteErrors
In addition: Warning message:
In dir.create("Domain_data") : 'Domain_data' already exists
Execution halted

haowang0508 commented 2 years ago

Hi,

Can you check the number of domains? You can look in ./Domain_data to find the largest domain index. The problem might be that the number of domains is smaller than your number of threads, so some nodes report the error. If so, please try lowering the nThread parameter.
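
The check suggested above could be sketched in base R as follows. This is only an illustration: the helper names `largest_domain_index` and `cap_threads` are hypothetical, and the file-name pattern assumes that the "xxx" in Genomic_loc_domain_xxx.txt is an integer index, which the thread does not confirm.

```r
# Hypothetical helper: largest domain index among cached file names.
# Assumes names look like "Genomic_loc_domain_<integer>.txt".
largest_domain_index <- function(file_names) {
  pattern <- "^Genomic_loc_domain_([0-9]+)\\.txt$"
  matched <- grepl(pattern, file_names)
  if (!any(matched)) return(0L)
  max(as.integer(sub(pattern, "\\1", file_names[matched])))
}

# Hypothetical helper: never request more workers than there are domains.
cap_threads <- function(requested, n_domains) {
  max(1L, min(as.integer(requested), as.integer(n_domains)))
}

# Made-up file names for the example; on a real run one might pass
# list.files("Domain_data") or whichever directory holds the domain files.
example_files <- c("Genomic_loc_domain_1.txt", "Genomic_loc_domain_2.txt",
                   "Genomic_loc_domain_57.txt", "notes.txt")
n_domains <- largest_domain_index(example_files)
n_thread  <- cap_threads(90, n_domains)
cat("domains:", n_domains, " nThread to try:", n_thread, "\n")
```

The capped value could then be passed as nThread in place of a fixed number such as 90.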

haowang0508 commented 2 years ago

Also, it seems that the loop variable i is not used inside the loop. Is the problem observed on the first iteration or on later ones? If you run this code in the same directory again and again, the cached intermediate files may disrupt the loop. Please change your working directory or clean up the intermediate files.
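
A minimal sketch of the two points above, in base R: index the size table with i, and give every chromosome its own freshly cleared output directory. The chromosome table is made up, and `run_one` is a hypothetical stand-in that only records its inputs; in a real script it would be replaced by the actual reconstruction call.

```r
# Made-up chromosome table standing in for the size file read in the thread.
all_size <- data.frame(V1 = c("chr1", "chr2", "chr3"),
                       V2 = c(196e6, 149e6, 110e6),
                       stringsAsFactors = FALSE)

# Hypothetical stand-in for the reconstruction call; it only records its inputs.
run_one <- function(chr_name, chr_size, out_dir) {
  data.frame(chr_name = chr_name, chr_size = chr_size, out_dir = out_dir,
             stringsAsFactors = FALSE)
}

base_dir <- file.path(tempdir(), "flamingo_runs")
results <- list()
for (i in seq_len(nrow(all_size))) {
  chr_name <- as.character(all_size[i, 1])   # index by i, not a fixed row 1
  chr_size <- all_size[i, 2]
  out_dir  <- file.path(base_dir, chr_name)
  unlink(out_dir, recursive = TRUE)          # drop cached files from earlier runs
  dir.create(out_dir, recursive = TRUE)      # one clean directory per chromosome
  results[[chr_name]] <- run_one(chr_name, chr_size, out_dir)
}
print(do.call(rbind, results))
```

If the working directory is switched to out_dir before each run, a relative input path such as './fat.hic' would no longer resolve, so an absolute path to the .hic file is probably the safer choice.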

haowang0508 commented 2 years ago

Hello, did you fix the issue?