arendsee / fagin

Classify genes using a syntenic filter
GNU General Public License v3.0
0 stars 0 forks source link

error when reading synteny map #22

Open lijing28101 opened 4 years ago

lijing28101 commented 4 years ago

When fagin loading synteny map, an error occurred as below:

    Get synteny map file from config

N197> "function (.)
{
    con@input@syn[[target]]
}"
[1] "GlymaLeeGm01_glyma.Wm82.V4.4PTR.Gm01.syn"

N198> "funnel(., seqinfo_a = rmonad::view(., "seqinfo", focal), seqinfo_b = rmonad::view(.,
    "seqinfo", target))"
N199> "synder::read_synmap"
 * ERROR: invalid class “GRanges” object: 'seqnames(x)' contains missing values

 -----------------

[[1]]
[1] "GlymaLeeGm01_glyma.Wm82.V4.4PTR.Gm01.syn"

$seqinfo_a
Seqinfo object with 1 sequence from Glycine_max_v4 genome:
  seqnames             seqlengths isCircular         genome
  glyma.Wm82.gnm4.Gm01   57932355         NA Glycine_max_v4

$seqinfo_b
Seqinfo object with 1 sequence from Glycine_max_Lee genome:
  seqnames            seqlengths isCircular          genome
  glyma.Lee.gnm1.Gm01   58711475         NA Glycine_max_Lee

 *** FAILURE ***

But if read synteny map directly using synder::read_synmap, there is no problem.

arendsee commented 4 years ago

The error message says:

ERROR: invalid class “GRanges” object: 'seqnames(x)' contains missing value

This means there is something wrong with the input GRanges object.

How exactly are you running synder::read_synmap? Did you give it the synteny map filename or the GRanges object? Also did you pass it the Seqinfo objects describing both the genomes?

You can see what Seqinfo objects are being passed to read_synmap by extracting them from the Rmonad object.

seqinfo_a = rmonad::esc(rmonad::view(m, "seqinfo", focal))
seqinfo_b = rmonad::esc(rmonad::view(m, "seqinfo", target))
synder::read_synmap(synfile, seqinfo_a=seqinfo_a, seqinfo_b=seqinfo_b)

Where focal and target are variables stored in the function environment. Tell me the output of this code.

Also, can you make a small, reproducible case where this code fails? If I can't replicate the error, then I can't debug it.

lijing28101 commented 4 years ago

When I run synder::read_synmap(synfile), there is no error. But the error occurred after adding seqinfo_a and seqinfo_b. There may be some error in Granges, but I didn't find any error before the step read_synmap using get_error.

Here is my script:

library(fagin)
library(rmonad)
library(knitr)
library(magrittr)
library(readr)

get_soybean_config2 <- function(){
  con <- fagin::config()
  con@archive = "glymax-archive2"
  con@synder@offsets = c(1L,1L)
  con@synder@trans = "p" # proportion transform mummer
  con@alignment@dna2dna_maxspace = 1e8L
  con@input@focal_species = "Glycine_max_v4"
  con@input@gff <- list(
    "Glycine_max_v4"   = "test.Wm.gff"
    , "Glycine_max_Lee"  = "test.Lee.gff"
  )
  con@input@fna <- list(
    "Glycine_max_v4"   = "test.Wm.fna"
    , "Glycine_max_Lee"  = "test.Lee.fna"
  )
  con@input@syn <- list(
    "Glycine_max_Lee" = "test.syn"
  )
  con@input@tree <- "tree.2"
  con@input@query_gene_list <- "test.orphan.txt"
  con@input@control_gene_list <- "test.nonorphan.txt"
  fagin::validate_config(con)
  con
}

con2 <- get_soybean_config2()
{
  as_monad(get_species(con2)) %>%
    rmonad::loop(
      FUN = fagin:::load_species,
      con = con2
    ) %>>%

    {
      load_gene_list(con2@input@query_gene_list)
    } %>% rmonad::tag("query_genes") %>>%

    {
      load_gene_list(con2@input@control_gene_list)
    } %>% rmonad::tag("control_genes")

} -> m2

seqinfo_a = rmonad::esc(rmonad::view(m2, "seqinfo", focal))

seqinfo_b = rmonad::esc(rmonad::view(m2, "seqinfo", target))

test <- synder::read_synmap("test.syn", seqinfo_a=seqinfo_a, seqinfo_b=seqinfo_b)

I attached the test input files, only the first 100,000 bases of both focal and target, which produce the same error as the original files.

test.zip