rfael0cm / RTIGER


error: 'names' attribute [3] must be the same length as the vector [1] #14

Open ncsumaize opened 7 months ago

ncsumaize commented 7 months ago

I get this error after model fitting (I think). The GenotypePlots are created, then the error pops up, and the object with genotype calls is not returned. Any suggestions on how to work around this error?

library(RTIGER)
library(Gviz)
Loading required package: grid
library(rtracklayer)
setupJulia(JULIA_HOME = "C:/Users/jholland/AppData/Local/Programs/Julia-1.9.4/bin")
Julia version 1.9.4 at location C:\Users\jholland\AppData\Local\Programs\JULIA-~1.4\bin will be used.
Loading setup script for JuliaCall...
Finish loading setup script for JuliaCall.
WARNING: Your Julia version is different than 1.0.5. We recommend to use 1.0.5 to improve speeed. Using other versions might give problems or do not work on higher speed.
sourceJulia()

file_paths = list.files("Q:/My Drive/tsi_new/rtiger/popB1", pattern = "\\.chr1\\.txt$", full.names = TRUE)
file_paths = file_paths[c(1, 25, 76, 97)]

# Get sample names
sampleIDs <- sub('\\.chr1\\.txt$', '', basename(file_paths))

# Create the expDesign object
expDesign = data.frame(files=file_paths, name=sampleIDs)

# give chromosome lengths
chr_len = c(308452471)
names(chr_len) = "chr1"
myres = RTIGER(expDesign = expDesign,
               # outputdir = "popB1/output",
               outputdir = "Q:/My Drive/tsi_new/rtiger/popB1/output",
               seqlengths = chr_len,
               rigidity = 20,
               nstates = 2, # for BC1 population
               save.results = TRUE)
Loading data and generating RTIGER object.
Using 2 states for fitting.
[1] "Loading file: Q:/My Drive/tsi_new/rtiger/popB1/Early_W22_X_Chalco_teo._101-1source_of_Mtsi429-1.chr1.txt"
[1] "Loading file: Q:/My Drive/tsi_new/rtiger/popB1/ftsi_x_Mtsi+__extracted_from_Teo._Col._101 JKP8874x8884-106_465-2.chr1.txt"
[1] "Loading file: Q:/My Drive/tsi_new/rtiger/popB1/P8Sibs_of_Population_B_P8962-26.chr1.txt"
[1] "Loading file: Q:/My Drive/tsi_new/rtiger/popB1/W22_ftsi_isolate_A_462-1.chr1.txt"

Fitting the parameters and Viterbi decoding.
post processing value is: TRUE
R value autotune is: FALSE
Number of iterations run: 30

Plotting samples Genotypes.
PLotting CO number per chromosome.
Error in names(x) <- value :
  'names' attribute [3] must be the same length as the vector [1]

rfael0cm commented 7 months ago

Hi Jim! This error is because nstates = 2. I just realized that I have not tested state counts other than 3 as thoroughly as I thought. I will work on it and get back to you. Meanwhile, you can get the results by using the option save.results = F. The only results that get saved are the plots. Once I have figured out where the problem is, I will get back to you. Let me know if you see any other problems. :-)
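For reference, the message itself is a standard base R error, raised whenever more names are assigned than the vector has elements. The following is a minimal sketch of that mechanism only. The variable names are hypothetical and nothing here is taken from RTIGER's plotting code; it simply shows how three labels applied to a length-1 vector produce the same kind of failure.

```r
# Hypothetical illustration only; not taken from RTIGER's source.
co_counts <- c(12)                     # a length-1 vector
state_labels <- c("s1", "s2", "s3")    # three labels, e.g. one per state

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    names(co_counts) <- state_labels   # 3 names for 1 element: fails
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If code like this sits in a plotting step that assumes three states, a two-state fit could plausibly trip it, but that is an assumption until the maintainer confirms where the failure occurs.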

ncsumaize commented 7 months ago

I may have introduced another problem that may be unrelated. I interpreted this statement in the data input format: "But the interpretation of references allele and alternate allele is completely arbitrary and it is the user who defines them." to mean that REF/ALT alleles do not necessarily correspond to parent_1 vs parent_2 AND can swap back and forth among markers. Now I am guessing my interpretation is wrong. I think it means REF can be either parent_1 OR parent_2 but ALL SNPs should have REF allele correspond to one parent.

In other words, I have two parents aligned to a third distinct line as reference genome. That means at some sites in my VCF parent_1 has REF allele and at other sites parent_2 has REF allele. But if RTIGER expects REF to always correspond to one parent, then it will look like I have just a random mess of haplotypes and probably everything is going to fail even if I can get an output.

So, I am now testing if updating the input files to make REF consistently refer to allele count from one parent will give a sensible output. Can you advise if my new interpretation of REF/ALT is correct?

rfael0cm commented 7 months ago

Hi Jim, that is correct. One parent should always be the reference, but this is up to the user to decide which one. This way, every time RTIGER infers a region as homozygous 1, it means that it belongs to the reference parent you decided upon. Does it make sense?
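As a rough sketch of the fix discussed here, allele counts can be re-polarized so that the REF columns always describe the same parent. The example below uses base R on toy data. The column names, the `parent1_has_ref` vector, and the helper `polarize_to_parent1()` are all hypothetical and would need to be adapted to the actual input files and parental genotype calls.

```r
# Toy allele-count table; column names are assumptions for illustration.
counts <- data.frame(
  chr        = "chr1",
  pos        = c(100L, 250L, 400L),
  ref_allele = c("A", "C", "G"),
  ref_count  = c(5L, 0L, 3L),
  alt_allele = c("T", "G", "A"),
  alt_count  = c(0L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: TRUE where parent_1 carries the reference-genome allele.
parent1_has_ref <- c(TRUE, FALSE, TRUE)

# Swap REF/ALT alleles and counts at sites where parent_2 carries REF,
# so that after the swap REF always refers to parent_1.
polarize_to_parent1 <- function(df, parent1_has_ref) {
  stopifnot(length(parent1_has_ref) == nrow(df))
  swap <- !parent1_has_ref
  out <- df
  out$ref_allele[swap] <- df$alt_allele[swap]
  out$alt_allele[swap] <- df$ref_allele[swap]
  out$ref_count[swap]  <- df$alt_count[swap]
  out$alt_count[swap]  <- df$ref_count[swap]
  out
}

polarized <- polarize_to_parent1(counts, parent1_has_ref)
print(polarized)
```

Sites where neither parent, or both parents, carry the reference allele are not handled in this sketch and would presumably need to be filtered out beforehand.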

ncsumaize commented 7 months ago

Yes, the assignment of REF calls to one parental haplotype makes sense. I fixed my input files to correct that problem, but the original error listed at the start of this thread remains. At the moment I am working around the original error by flagging out the graphics output sections of mainfun.R, so I can get the fitted RTIGER object returned.