JiaxinYangJX / FLAMINGOrLite

Lite version of FLAMINGO, High-resolution 3D chromosome structures reconstruction based on Hi-C

Error: 2L not found #3

Open NMaziak opened 8 months ago

NMaziak commented 8 months ago

Hello, I have an issue similar to one previously reported for FLAMINGOr.

If I input this:

res = flamingo_main_noura(hic_data="matrix.hic",
                    file_format='hic',
                    domain_res=1e6,
                    frag_res=1e4,
                    chr_name="chr2L",
                    normalization='KR',
                    nThread=20)

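I get the error in the title ("2L not found"). I suspect it comes from the gsub in the function construct_obj_from_hic, which strips "chr" from the chromosome name before the file is queried: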

function (hic_file, resolution, chr_name, normalization) 
{
    library(strawr)
    options(scipen = 999)
    chr_number <- gsub("chr", "", chr_name)
    normalized_data = strawr::straw(normalization, hic_file, 
        chr_number, chr_number, unit = "BP", binsize = resolution)
    n <- max(normalized_data[, 2])/resolution + 1
    i_ind <- (normalized_data[, 1]/resolution) + 1
    j_ind <- (normalized_data[, 2]/resolution) + 1
    input_if = Matrix::sparseMatrix(i = i_ind, j = j_ind, x = normalized_data[, 
        3], dims = c(n, n))
    res = new("flamingo", IF = input_if, n_frag = n, chr_name = chr_name)
    return(res)
}

My .hic file was aligned to a UCSC genome, so the chromosome names carry the "chr" prefix. When I run strawr::readHicChroms("matrix.hic") I get, as expected:

  index  name   length
1     0   All   137547
2     1 chr2L 23513712
3     2 chr2R 25286936
4     3 chr3L 28110227
5     4 chr3R 32079331
6     5  chr4  1348131
7     6  chrX 23542271
8     7  chrY  3667352
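
To illustrate the mismatch, here is a minimal sketch in base R. The `hic_names` vector is copied by hand from the output above rather than read from the file, and I am assuming the lookup is done against exactly these names:

```r
# Chromosome names copied from the readHicChroms() output above (typed by hand).
hic_names <- c("All", "chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX", "chrY")

chr_name <- "chr2L"
# Same call as in construct_obj_from_hic: removes "chr" from the name.
chr_number <- gsub("chr", "", chr_name)

print(chr_number)                # "2L"
print(chr_number %in% hic_names) # FALSE: the stripped name is not in the file
print(chr_name %in% hic_names)   # TRUE: the original name is
```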

If the gsub is not necessary, it would be great to have it removed so that I can run the tool.
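
If some users do depend on the prefix being stripped, one possible middle ground is to pick whichever spelling the file actually contains. This is only a sketch: `resolve_chr_name` is a hypothetical helper that exists in neither FLAMINGOrLite nor strawr, and the `available` vector below is typed by hand for the example.

```r
# Hypothetical helper: return the spelling of chr_name that the .hic file uses.
resolve_chr_name <- function(chr_name, available) {
  candidates <- unique(c(
    chr_name,                   # as given, e.g. "chr2L"
    sub("^chr", "", chr_name),  # leading "chr" removed, e.g. "2L"
    paste0("chr", chr_name)     # "chr" added, for input such as "2L"
  ))
  hit <- candidates[candidates %in% available]
  if (length(hit) == 0) {
    stop("Chromosome '", chr_name, "' not found; available: ",
         paste(available, collapse = ", "))
  }
  hit[1]
}

# In practice the names could come from the file itself, e.g.
#   available <- strawr::readHicChroms("matrix.hic")$name
available <- c("All", "chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX", "chrY")

print(resolve_chr_name("chr2L", available))
print(resolve_chr_name("2L", available))
```

Both calls resolve to the name stored in the file, so the same code would work for files with or without the "chr" prefix.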

Thanks again, Noura

tycheleturner commented 8 months ago

I have the code running on a dataset with "chr" in the chromosome name. Here's what I did:

  1. Downloaded the code from this GitHub repository using Code -> Download ZIP

  2. Unzipped the code

    unzip FLAMINGOrLite-master.zip

  3. Modified the R code in the package to remove the gsub. It's in the FLAMINGOrLite-master/R/data_utils.R file. Here's the change (the old line is commented out, the new line is below it):

    #chr_number <- gsub("chr","",chr_name)
    chr_number <- chr_name

  4. Rebuilt the package

    R CMD build FLAMINGOrLite-master

  5. Reinstalled the package

    R CMD INSTALL FLAMINGOrLite_0.0.0.9000.tar.gz

  6. Ran the code on my dataset as follows:

    library(FLAMINGOrLite)

    res = flamingo_main(hic_data='inter_30.hic', file_format='hic', domain_res=1e6, frag_res=5e3, chr_name='chr19', normalization='NONE', nThread=2)



Note that the run hasn't finished yet, but it has been running for about 10 minutes so far.

tycheleturner commented 7 months ago

Update: the run completed successfully.
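
For anyone repeating the workaround, the source edit in step 3 could also be scripted instead of done by hand. This is a rough sketch, not something I ran as part of the steps above: `patch_chr_gsub` is a hypothetical helper, and it assumes the gsub call appears on a single line in the form shown earlier. The demo runs on a throwaway temp file rather than on the package source.

```r
# Hypothetical helper: replace the gsub call with a plain assignment in one file.
# Returns (invisibly) the number of lines that were changed.
patch_chr_gsub <- function(path) {
  src <- readLines(path)
  # Matches gsub("chr","",chr_name) with or without spaces after the commas.
  pattern <- 'gsub\\("chr", *"", *chr_name\\)'
  hit <- grepl(pattern, src)
  src[hit] <- sub(pattern, "chr_name", src[hit])
  writeLines(src, path)
  invisible(sum(hit))
}

# Demo on a temporary file that mimics the relevant line.
demo <- tempfile(fileext = ".R")
writeLines(c(
  "f <- function(chr_name) {",
  '    chr_number <- gsub("chr","",chr_name)',
  "    chr_number",
  "}"
), demo)

n_changed <- patch_chr_gsub(demo)
patched_line <- readLines(demo)[2]
print(patched_line)
```

On the unzipped source this would be called as `patch_chr_gsub("FLAMINGOrLite-master/R/data_utils.R")`, followed by the rebuild and reinstall steps.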