ComparativeGenomicsToolkit / Comparative-Annotation-Toolkit

AugustusCGP error #244

Open francicco opened 3 years ago

francicco commented 3 years ago

Hi,

Me again. I'm getting this:

ERROR: 2021-02-14 18:13:48,059 - [pid 3967] Worker Worker(salt=024135178, workers=28, host=bc4login2.bc4.acrc.priv, username=tk19812, pid=20263) failed    Task: EvaluateDriverTask for Djun
Traceback (most recent call last):
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 191, in run
    new_deps = self._run_get_new_deps()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 133, in _run_get_new_deps
    task_gen = self.task.run()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/__init__.py", line 2083, in run
    results = classify(eval_args)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/classify.py", line 83, in classify
    ec_df = evaluation_classify(aln_mode, ref_tx_dict, tx_dict, tx_biotype_map, psl_iter, seq_dict)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/classify.py", line 123, in evaluation_classify
    r.extend(find_indels(tx, psl, aln_mode))
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/classify.py", line 300, in find_indels
    row = parse_indel(left_pos, right_pos, coordinate_fn, tx, q_offset, 'Insertion')
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/classify.py", line 276, in parse_indel
    name=''.join([indel_type, gap_type]))
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/transcripts.py", line 167, in get_bed
    if new_start != exon_intervals[0].start:
IndexError: list index out of range

:( F

francicco commented 3 years ago

Any help with this? :(

francicco commented 3 years ago

I think it might be due to the annotation F

ifiddes commented 3 years ago

Hi Francesco,

Sorry for the delay in looking at this. I don't think this is a problem with your annotation; I think it is a problem with the recent changes to the indel detection code. It is hitting an edge case when trying to slice out the coordinates of the detected indel.
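To illustrate the kind of edge case described here, the following is a minimal sketch and not CAT's actual code. It assumes exons are plain half-open (start, stop) tuples; the names clip_exons and first_clipped_start are hypothetical. When the requested window overlaps no exon, the clipped list is empty, and indexing its first element would raise the same "IndexError: list index out of range" seen at the bottom of the traceback.

```python
from typing import List, Optional, Tuple

# Half-open (start, stop) pair; a hypothetical stand-in for CAT's interval objects.
Interval = Tuple[int, int]


def clip_exons(exons: List[Interval], start: int, stop: int) -> List[Interval]:
    """Clip exons to the window [start, stop), dropping exons that do not overlap it."""
    clipped = []
    for ex_start, ex_stop in exons:
        new_start, new_stop = max(ex_start, start), min(ex_stop, stop)
        if new_start < new_stop:
            clipped.append((new_start, new_stop))
    return clipped


def first_clipped_start(exons: List[Interval], start: int, stop: int) -> Optional[int]:
    """Return the start of the first clipped exon, or None if the window hits no exon."""
    clipped = clip_exons(exons, start, stop)
    if not clipped:
        # Without this guard, clipped[0] raises "IndexError: list index out of range".
        return None
    return clipped[0][0]
```

A window that falls entirely inside an intron, for example (200, 300) against exons (100, 200) and (300, 400), is the case that produces the empty list.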

Can you send me the following files (similar to what you have done before):

  1. workdir/transMap/Djun.filtered.psl
  2. workdir/transMap/Djun.filtered.gp

ifiddes commented 3 years ago

This error isn't in augustusCGP, it is in transMap (the indel classification code doesn't run against CGP predictions, because they don't have a reference transcript to compare to)

francicco commented 3 years ago

This error isn't in augustusCGP, it is in transMap (the indel classification code doesn't run against CGP predictions, because they don't have a reference transcript to compare to)

Yes, I realised it after trying to run CAT only with transmap.

Here are the files, thanks a lot. F

Djun.filtered.gp.gz

Djun.filtered.psl.gz

ifiddes commented 3 years ago

Sorry, I realized there are two files I am missing.

  1. workdir/transcript_alignment/Djun.transMap.mRNA.psl
  2. workdir/transcript_alignment/Djun.transMap.CDS.psl

francicco commented 3 years ago

I had to delete the folder, I restarted yesterday ... as soon as I have it I'll send it over. Sorry! F

francicco commented 3 years ago

Hi @ifiddes, the files are here

https://drive.google.com/drive/folders/1Hwt0Le0gNED6TMIXTz3bCTvwzdLbVQrl?usp=sharing

francicco commented 3 years ago

Hi Ian, did you get the files? F

ifiddes commented 3 years ago

Hi Francesco,

Sorry for the delay on evaluating this. I just ran your files through that code block on the current master branch and no errors occurred. What commit / release are you on?

francicco commented 3 years ago

I can try to install it again from the master. I'll let you know. F

francicco commented 3 years ago

I reran it:

INFO: 2021-03-01 10:12:55,754 - Filtering transMap PSL for Djun.
ERROR: 2021-03-01 10:13:14,320 - [pid 24696] Worker Worker(salt=656852402, workers=28, host=highmem16.bc4.acrc.priv, username=tk19812, pid=16464) failed    Task: FilterTransMap for Haoe
Traceback (most recent call last):
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 191, in run
    new_deps = self._run_get_new_deps()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 133, in _run_get_new_deps
    task_gen = self.task.run()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/__init__.py", line 1293, in run
    json_target)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/filter_transmap.py", line 166, in filter_transmap
    filter_overlapping_genes)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/filter_transmap.py", line 352, in filter_clusters
    collapsed_df = pd.DataFrame(collapsed_genes, columns=['GeneId', 'CollapsedGeneIds', 'CollapsedGeneNames'])
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pandas/core/frame.py", line 490, in __init__
    mgr = init_dict({}, index, columns, dtype=dtype)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 239, in init_dict
    val = construct_1d_arraylike_from_scalar(np.nan, len(index), nan_dtype)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1449, in construct_1d_arraylike_from_scalar
    dtype = dtype.dtype
AttributeError: type object 'object' has no attribute 'dtype'

And it seems to be a different error. F

ifiddes commented 3 years ago

Hmm, odd. This seems to be a pandas error introduced by a new version.

Can you run pip install pandas==1.1.0 and then re-run?

francicco commented 3 years ago

[tk19812@bc4login3 Comparative-Annotation-Toolkit]$ source venv/bin/activate
(venv) [tk19812@bc4login3 Comparative-Annotation-Toolkit]$ pip install pandas==1.1.0
Collecting pandas==1.1.0
  Downloading pandas-1.1.0-cp37-cp37m-manylinux1_x86_64.whl (10.5 MB)
     |████████████████████████████████| 10.5 MB 14.6 MB/s 
Requirement already satisfied: python-dateutil>=2.7.3 in ./venv/lib/python3.7/site-packages (from pandas==1.1.0) (2.8.1)
Requirement already satisfied: numpy>=1.15.4 in ./venv/lib/python3.7/site-packages (from pandas==1.1.0) (1.20.1)
Requirement already satisfied: pytz>=2017.2 in ./venv/lib/python3.7/site-packages (from pandas==1.1.0) (2021.1)
Requirement already satisfied: six>=1.5 in ./venv/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas==1.1.0) (1.15.0)
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.0.0
    Uninstalling pandas-1.0.0:
      Successfully uninstalled pandas-1.0.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cat 2.0 requires pandas==1.0, but you have pandas 1.1.0 which is incompatible.
Successfully installed pandas-1.1.0

I'll rerun CAT... I don't need to reinstall CAT, right? F

ifiddes commented 3 years ago

No, you shouldn't need to change anything. Although, it is odd that you actually upgraded pandas there...

If the error happens again, we will need to look at the input files to filter transMap. Odd that this is happening suddenly now.
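The pip output above shows why the resolver complained: cat 2.0 declares pandas==1.0, the replaced installation was 1.0.0 (which satisfies that pin), and 1.1.0 does not. The sketch below is a simplified illustration of an exact-match check with zero padding, not pip's or setuptools' implementation; it only handles plain numeric release strings, and the function names are hypothetical.

```python
from itertools import zip_longest


def release_tuple(version: str) -> tuple:
    """Turn a plain release string such as '1.1.0' into (1, 1, 0)."""
    return tuple(int(part) for part in version.split("."))


def matches_exact_pin(installed: str, pinned: str) -> bool:
    """Compare two release strings, padding the shorter one with zeros."""
    pairs = zip_longest(release_tuple(installed), release_tuple(pinned), fillvalue=0)
    return all(a == b for a, b in pairs)
```

Under this simplified rule, matches_exact_pin("1.0.0", "1.0") is true and matches_exact_pin("1.1.0", "1.0") is false, which matches the conflict pip reported.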

francicco commented 3 years ago

Ok, now transmap works again. I'm gonna try to add AugustusCGP with the UTR (--augustus-utr-on) F

francicco commented 3 years ago

I'm getting many of these errors:

ERROR: 2021-03-02 11:21:43,843 - [pid 13391] Worker Worker(salt=144150971, workers=28, host=highmem12.bc4.acrc.priv, username=tk19812, pid=12662) failed    ToilTask: AugustusDriverTask for Eisa using batchSystem single_machine
Traceback (most recent call last):
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 191, in run
    new_deps = self._run_get_new_deps()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/luigi/worker.py", line 133, in _run_get_new_deps
    task_gen = self.task.run()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/__init__.py", line 1465, in run
    tools.misc.convert_gtf_gp(out_gp, out_gtf)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/misc.py", line 53, in convert_gtf_gp
    procOps.run_proc(cmd, stdout=outf)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/procOps.py", line 73, in run_proc
    pl.wait()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1127, in wait
    self.raiseIfExcept()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1085, in raiseIfExcept
    p.raiseIfExcept()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 749, in raiseIfExcept
    raise self.exceptInfo[0].with_traceback(self.exceptInfo[2])
tools.pipeline.ProcException: process exited 255: gtfToGenePred -genePredExt /mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap.outDir/augustus/Eisa.augTMR.gtf /dev/stdout

It seems like the only thing I can do is using TransMap... :( F

francicco commented 3 years ago

Now, because of pandas, validate_gff3 isn't working anymore... :(

(venv) [tk19812@bc4login3 Annotation]$ validate_gff3 Junonia_coenia_JC_v1.0.gff3
Traceback (most recent call last):
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 584, in _build_master
    ws.require(__requires__)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 901, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 792, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (pandas 1.1.0 (/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages), Requirement.parse('pandas==1.0'), {'cat'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/bin/validate_gff3", line 4, in <module>
    __import__('pkg_resources').require('cat==2.0')
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 3253, in <module>
    @_call_aside
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 3237, in _call_aside
    f(*args, **kwargs)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 3266, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 586, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 599, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 787, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'pandas==1.0' distribution was not found and is required by cat

diekhans commented 3 years ago

If you can make the file /mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap.outDir/augustus/Eisa.augTMR.gtf available, we can determine why gtfToGenePred is not working.

francicco commented 3 years ago

That file does not exist. F
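Since the GTF handed to gtfToGenePred turned out to be missing, one way to get a clearer message than "process exited 255" is to check the input before launching the external tool. This is a minimal sketch using only the Python standard library; run_with_input_check is a hypothetical helper and is not part of CAT's tools.procOps module.

```python
import subprocess
from pathlib import Path
from typing import List


def run_with_input_check(cmd: List[str], input_path: str) -> subprocess.CompletedProcess:
    """Run cmd only if input_path exists and is non-empty; raise a clear error otherwise."""
    path = Path(input_path)
    if not path.is_file():
        raise FileNotFoundError(f"input file does not exist: {path}")
    if path.stat().st_size == 0:
        raise ValueError(f"input file is empty: {path}")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, capture_output=True, text=True, check=True)
```

For the command in the traceback, cmd would be ["gtfToGenePred", "-genePredExt", gtf_path, "/dev/stdout"] with gtf_path also passed as input_path.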

francicco commented 3 years ago

@diekhans, as a side note, etraining of your Augustus version gives me a segmentation fault on long genes. F

diekhans commented 3 years ago

My fix is merged into the augustus master source; although I don't know if a release has been made.

If you can produce a test case outside of cactus, it can be debugged.

diekhans commented 3 years ago

I mean outside of CAT.

francicco commented 3 years ago

Augustus-3.3.3 works, I guess there will be a problem using CAT+Augustus F

diekhans commented 3 years ago

This is probably a bug unrelated to what I did. My change is newer than 3.3.3 and is not yet in a release.

I would suggest trying the master at https://github.com/Gaius-Augustus/Augustus. If it still fails, someone can debug it.

francicco commented 3 years ago

I'm honestly a bit confused about all the Augustus releases out there. But basically, I don't know if you remember the release you fixed for the bug related to CAT; well, the etraining of that release throws a segmentation fault for genes longer than ~15 kb. F

diekhans commented 3 years ago

Well, my change is:

Fixed check for --speciesfilenames removing all species. Improve various error message around missing species

So unless it is really weird, I don't think I broke it; I built on something already broken.

Hence my suggestion to go with the current master, it has my change. Maybe the etraining bug is fixed, maybe it isn't. If it isn't, then we can create a ticket if we can make a test case outside of CAT.

francicco commented 3 years ago

the current is the 3.4.0 and I can't compile it.

/mnt/storage/easybuild/software/binutils/2.26-GCCcore-5.4.0/bin/ld.gold: error: cannot find -lmysqlclient
randseqaccess.o:randseqaccess.cc:function MysqlAccess::getSeq(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int, Strand): error: undefined reference to 'mysqlpp::Query::str[abi:cxx11](mysqlpp::SQLQueryParms&)'
randseqaccess.o:randseqaccess.cc:function MysqlAccess::getSeq(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int, Strand): error: undefined reference to 'mysqlpp::Query::str[abi:cxx11](mysqlpp::SQLQueryParms&)'
randseqaccess.o:randseqaccess.cc:function MysqlAccess::getSeq(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int, Strand): error: undefined reference to 'mysqlpp::SQLTypeAdapter::SQLTypeAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)'
randseqaccess.o:randseqaccess.cc:function MysqlAccess::getFeatures(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int, Strand): error: undefined reference to 'mysqlpp::Query::str[abi:cxx11](mysqlpp::SQLQueryParms&)'
randseqaccess.o:randseqaccess.cc:function MysqlAccess::getFeatures(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int, Strand): error: undefined reference to 'mysqlpp::SQLTypeAdapter::SQLTypeAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)'
randseqaccess.o:randseqaccess.cc:function mysqlpp::UseQueryResult::~UseQueryResult(): error: undefined reference to 'mysql_free_result'
randseqaccess.o:randseqaccess.cc:function mysqlpp::UseQueryResult::~UseQueryResult(): error: undefined reference to 'mysql_free_result'
randseqaccess.o:randseqaccess.cc:function mysqlpp::RefCountedPointer<st_mysql_res, mysqlpp::RefCountedPointerDestroyer<st_mysql_res> >::~RefCountedPointer(): error: undefined reference to 'mysql_free_result'
randseqaccess.o:randseqaccess.cc:function void populate_genomes<(mysqlpp::sql_dummy_type)0>(genomes*, mysqlpp::Row const&): error: undefined reference to 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > mysqlpp::String::conv<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const'
randseqaccess.o:randseqaccess.cc:function void populate_genomes<(mysqlpp::sql_dummy_type)0>(genomes*, mysqlpp::Row const&): error: undefined reference to 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > mysqlpp::String::conv<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const'
randseqaccess.o:randseqaccess.cc:function void populate_genomes<(mysqlpp::sql_dummy_type)0>(genomes*, mysqlpp::Row const&): error: undefined reference to 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > mysqlpp::String::conv<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const'
randseqaccess.o:randseqaccess.cc:function void populate_hints<(mysqlpp::sql_dummy_type)0>(hints*, mysqlpp::Row const&): error: undefined reference to 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > mysqlpp::String::conv<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const'
randseqaccess.o:randseqaccess.cc:function int MysqlAccess::get_region_coord<assembly>(int, int, int, std::vector<assembly, std::allocator<assembly> >&): error: undefined reference to 'mysqlpp::Query::str[abi:cxx11](mysqlpp::SQLQueryParms&)'
randseqaccess.o:randseqaccess.cc:function int MysqlAccess::get_region_coord<assembly>(int, int, int, std::vector<assembly, std::allocator<assembly> >&): error: undefined reference to 'mysqlpp::SQLTypeAdapter::SQLTypeAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)'
collect2: error: ld returned 1 exit status
make[1]: *** [augustus] Error 1
make[1]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/src'
make: *** [all] Error 2

But it's not related to this ticket... It's a mess, I keep finding bugs and it's quite frustrating :(

F

diekhans commented 3 years ago

Francesco Cicconardi notifications@github.com writes:

the current is the 3.4.0 and I can't compile it.

you need to disable mysql support

But it's not related to this ticket...

if you are interested in fixing this, start another ticket ..

It's a mess, I keep finding bugs and it's quite frustrating :(

yea, funding agencies want to pay for new stuff and not support, programmers are overpaid, we all try to do too much instead of make the current better.

francicco commented 3 years ago

The compilation went fine. I'm waiting for CAT with CGP to finish its run... Afterward I'll be ready to run my 63-genome alignment with 60 annotations... I hope it's gonna be fine! Thanks

F

francicco commented 3 years ago

Now I'm getting this

ERROR: 2021-03-08 13:50:58,923 - Got exit code 1 (indicating failure) from job _toil_worker JobFunctionWrappingJob file:/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/toil/augustus/Diul/jobStore kind-JobFunctionWrappingJob/v/instance-to5zp3j7.
WARNING: 2021-03-08 13:50:58,923 - Job failed with exit value 1: 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/v/instance-to5zp3j7
WARNING: 2021-03-08 13:50:58,925 - The job seems to have left a log file, indicating failure: 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/v/instance-to5zp3j7
WARNING: 2021-03-08 13:50:58,925 - Log from job kind-JobFunctionWrappingJob/v/instance-to5zp3j7 follows:
=========>
        [2021-03-08T13:50:55+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
        [2021-03-08T13:50:55+0000] [MainThread] [I] [toil] Running Toil version 5.0.0-f182c6420554b258632a40bfa47a8f69e56675e4 on host highmem11.bc4.acrc.priv.
        [2021-03-08T13:50:55+0000] [MainThread] [I] [toil.worker] Working on job 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/v/instance-to5zp3j7
        [2021-03-08T13:50:55+0000] [MainThread] [I] [luigi-interface] Loaded ['luigi.cfg']
        [2021-03-08T13:50:57+0000] [MainThread] [I] [toil.worker] Loaded body Job('JobFunctionWrappingJob' kind-JobFunctionWrappingJob/v/instance-to5zp3j7) from description 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/v/instance-to5zp3j7
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-bc182bf44f8342bf9a5d07a81ffd63d8/Diul.fa' to path '/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/genome.fasta'
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-6fdd1576eb9e48f79306800d92ea7ddc/Diul.fa.gdx' to path '/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/genome.fasta.gdx'
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-0abf16121ac54ef49db14105ac0c8dd6/Diul.fa.flat' to path '/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/genome.fasta.flat'
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-6279ad83edf4418c8e005a3f66beda6d/extrinsic.ETM2.cfg' to path '/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/tmpu5275rb2.tmp'
        [2021-03-08T13:50:58+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-dc6f6fd6806a44d3ae6c124038936d7b/hints.db' to path '/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/tmp_bj5x3ld.tmp'
        Traceback (most recent call last):
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/toil/worker.py", line 393, in workerScript
            job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/toil/job.py", line 2358, in _runner
            returnValues = self._run(jobGraph=None, fileStore=fileStore)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/toil/job.py", line 2279, in _run
            return self.run(fileStore)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/venv/lib/python3.7/site-packages/toil/job.py", line 2502, in run
            rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/augustus.py", line 144, in run_augustus_chunk
            args.utr)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/cat/augustus.py", line 173, in run_augustus
            aug_output = tools.procOps.call_proc_lines(cmd)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/procOps.py", line 62, in call_proc_lines
            out = call_proc(cmd)
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/procOps.py", line 51, in call_proc
            pl.wait()
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1127, in wait
            self.raiseIfExcept()
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1085, in raiseIfExcept
            p.raiseIfExcept()
          File "/mnt/storage/scratch/tk19812/software/Comparative-Annotation-Toolkit/tools/pipeline.py", line 749, in raiseIfExcept
            raise self.exceptInfo[0].with_traceback(self.exceptInfo[2])
        tools.pipeline.ProcException: process signaled: SIGSEGV: augustus /mnt/storage/scratch/tk19812/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/highmem11.bc4.acrc.priv.16439.6061603868.tmp --predictionStart=-8001537 --predictionEnd=-8001537 --extrinsicCfgFile=/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/tmpu5275rb2.tmp --hintsfile=/mnt/storage/scratch/tk19812/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/highmem11.bc4.acrc.priv.16439.1574159266.tmp --UTR=1 --alternatives-from-evidence=0 --species=Heliconius_melpomene2.5 --allow_hinted_splicesites=atac --protein=0 --softmasking=1 --/augustus/verbosity=0
        [2021-03-08T13:50:58+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host highmem11.bc4.acrc.priv
<=========

F

diekhans commented 3 years ago

cause is: SIGSEGV: augustus /mnt/storage/scratch/tk19812/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/highmem11.bc4.acrc.priv.16439.6061603868.tmp --predictionStart=-8001537 --predictionEnd=-8001537 --extrinsicCfgFile=/mnt/storage/home/tk19812/scratch/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/tmpu5275rb2.tmp --hintsfile=/mnt/storage/scratch/tk19812/HeliconiniiProject/HeliconGenomeAlignmentAnnotation/Test.ANN.pipeline.Cactus.CAT.Hmel.TransMap+CGP.outDir/node-59d31032-21b4-477d-adcc-725ae64b7b57-f875fcd4-94c4-4749-8811-46221e3f890d/tmpqpm0xbjp/a27151f1-9993-4391-b5dc-3c972b474900/highmem11.bc4.acrc.priv.16439.1574159266.tmp --UTR=1 --alternatives-from-evidence=0 --species=Heliconius_melpomene2.5 --allow_hinted_splicesites=atac --protein=0 --softmasking=1 --/augustus/verbosity=0

If you can get the files and reproduce it on the command line, then we can debug it.

francicco commented 3 years ago

Hi @diekhans,

I guess these are the files:

highmem11.bc4.acrc.priv.16439.6061603868.tmp.txt tmpu5275rb2.tmp.txt highmem11.bc4.acrc.priv.16439.1574159266.tmp.txt

F

francicco commented 3 years ago

Hi @diekhans Any update about that? Cheers F

diekhans commented 3 years ago

Not yet, I'm very backed up right now. I will make sure I can reproduce it.

diekhans commented 3 years ago

looks like I need: Augustus/config/species/Heliconius_melpomene2.5/Heliconius_melpomene2.5_parameters.cfg.

francicco commented 3 years ago

Heliconius_melpomene2.5_parameters.cfg.txt

diekhans commented 3 years ago

Now it wants: Heliconius_melpomene2.5_weightmatrix.txt

francicco commented 3 years ago

Heliconius_melpomene2.5.zip

diekhans commented 3 years ago

as usual, we need the input files to determine the cause.

francicco commented 3 years ago

Didn't I send them to you already? F

diekhans commented 3 years ago

Oh sorry, I think I replied to an old message.

So I checked out master HEAD of https://github.com/Gaius-Augustus/Augustus.git

and it runs fine with your example. Are you using this?

francicco commented 3 years ago

No, I'm using augustus 3.4.0 F

diekhans commented 3 years ago

Please build the head of Augustus. Remember to disable building the MySQL extensions. Note that you only have to build the augustus program; if the other programs fail to compile, just use the existing versions.

I use: make -j $(nproc) -k MYSQL=false

and ignore failure of bam2hints
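
The steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: `build_augustus_head` and `run` are hypothetical helper names, the checkout directory defaults to `Augustus`, and the binary is expected at `bin/augustus` inside the checkout. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Sketch only: build_augustus_head and run are hypothetical helper names.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_augustus_head() {
    dest=${1:-Augustus}
    # Clone the official repository unless a checkout already exists.
    if [ ! -d "$dest/.git" ]; then
        run git clone https://github.com/Gaius-Augustus/Augustus.git "$dest" || return 1
    fi
    # -k keeps going past failing auxiliary programs such as bam2hints,
    # so a non-zero exit status from make is tolerated here.
    run make -C "$dest" -j "$(nproc)" -k MYSQL=false || true
    # What matters is that the augustus binary itself was produced.
    run test -x "$dest/bin/augustus"
}

# Usage (not executed here):
#   build_augustus_head Augustus && echo "augustus built"
```

The final check matters because `make -k` can report failure even when the only thing that failed was an auxiliary program.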

francicco commented 3 years ago

I'm confused. Which version should I use? F

diekhans commented 3 years ago

Use the head of the tree from git. Mine does report AUGUSTUS (3.4.0); however, when you check out from the tree rather than using a release, the reported version is not accurate.

If you are already doing this, then try running this command with the data you gave me:

../Augustus/bin/augustus data/highmem11.bc4.acrc.priv.16439.6061603868.tmp \
    --predictionStart=-8001537 --predictionEnd=-8001537 \
    --extrinsicCfgFile=data/tmpu5275rb2.tmp \
    --hintsfile=data/highmem11.bc4.acrc.priv.16439.1574159266.tmp --UTR=1 \
    --alternatives-from-evidence=0 --species=Heliconius_melpomene2.5 \
    --allow_hinted_splicesites=atac --protein=0 --softmasking=1 --/augustus/verbosity=0

It does not crash for me; if it does for you, we have some kind of environment difference.
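
When comparing runs, it can help to distinguish a crash from an ordinary error exit. A shell usually reports a process killed by a signal as exit status 128 plus the signal number, so a SIGSEGV (signal 11 on Linux) shows up as 139. A minimal sketch, where `describe_status` is a hypothetical helper and not part of Augustus or CAT:

```shell
# Hypothetical helper: turn a shell exit status into a readable description.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        # Statuses above 128 usually mean "terminated by signal (status - 128)".
        sig=$((status - 128))
        # Fall back to the bare number if the shell does not know the signal.
        name=$(kill -l "$sig" 2>/dev/null) || name=$sig
        echo "killed by SIG$name"
    else
        echo "exit code $status"
    fi
}

# Usage (not executed here): run the augustus command, then
#   describe_status $?
```

Note that a program can also exit with a plain status above 128 on its own, so the result is a strong hint rather than proof.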

francicco commented 3 years ago

Your version works; the official 3.4.0 fails. The problem with your version is etraining, which fails with long genes, unless you have changed something recently.

F

diekhans commented 3 years ago

When you say "my version", do you mean my fork at

https://github.com/diekhans/Augustus

or the official Augustus repo at

https://github.com/Gaius-Augustus/Augustus

???

you should not use diekhans/Augustus, it is now merged into the official repo.

If long genes still fail in the official repo, let's put together a test case and see if Mario or I can fix it.

francicco commented 3 years ago

Yes, your version is https://github.com/diekhans/Augustus

and the one with the segmentation fault is the Gaius-Augustus 3.4.0

I tried to compile it again:

[tk19812@bc4login2 augustus-3.4.0]$ make -j $(nproc) -k MYSQL=false
mkdir -p bin
cd src && make
make[1]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/src'
echo "-Wall -Wno-sign-compare -pedantic -O3 -std=c++11  -DZIPINPUT -DCOMPGENEPRED -DTESTING -DM_SQLITE" > cxxflags
make[1]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/src'
cd auxprogs && make
make[1]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs'
cd aln2wig; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/aln2wig'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: `aln2wig' is up to date.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/aln2wig'
cd bam2hints; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/bam2hints'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: `bam2hints' is up to date.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/bam2hints'
cd compileSpliceCands; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/compileSpliceCands'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: `compileSpliceCands' is up to date.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/compileSpliceCands'
cd filterBam; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/filterBam'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
(cd src;make)
make[3]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/filterBam/src'
Makefile:59: warning: overriding recipe for target `BIN'
Makefile:20: warning: ignoring old recipe for target `BIN'
Makefile:62: warning: overriding recipe for target `CHECKBAM'
Makefile:23: warning: ignoring old recipe for target `CHECKBAM'
Makefile:70: warning: overriding recipe for target `filterBam'
Makefile:31: warning: ignoring old recipe for target `filterBam'
Makefile:73: warning: overriding recipe for target `filterBam.o'
Makefile:34: warning: ignoring old recipe for target `filterBam.o'
Makefile:73: warning: overriding recipe for target `MatePairs.o'
Makefile:34: warning: ignoring old recipe for target `MatePairs.o'
Makefile:73: warning: overriding recipe for target `getReferenceName.o'
Makefile:34: warning: ignoring old recipe for target `getReferenceName.o'
Makefile:73: warning: overriding recipe for target `initOptions.o'
Makefile:34: warning: ignoring old recipe for target `initOptions.o'
Makefile:73: warning: overriding recipe for target `SingleAlignment.o'
Makefile:34: warning: ignoring old recipe for target `SingleAlignment.o'
Makefile:73: warning: overriding recipe for target `printElapsedTime.o'
Makefile:34: warning: ignoring old recipe for target `printElapsedTime.o'
Makefile:73: warning: overriding recipe for target `sumMandIOperations.o'
Makefile:34: warning: ignoring old recipe for target `sumMandIOperations.o'
Makefile:73: warning: overriding recipe for target `sumDandIOperations.o'
Makefile:34: warning: ignoring old recipe for target `sumDandIOperations.o'
Makefile:73: warning: overriding recipe for target `PairednessCoverage.o'
Makefile:34: warning: ignoring old recipe for target `PairednessCoverage.o'
Makefile:77: warning: overriding recipe for target `clean'
Makefile:38: warning: ignoring old recipe for target `clean'
g++     -std=c++0x  filterBam.o MatePairs.o getReferenceName.o initOptions.o SingleAlignment.o printElapsedTime.o sumMandIOperations.o sumDandIOperations.o PairednessCoverage.o -o filterBam /mnt/storage/home/tk19812/scratch/software/bamtools/usr/local/include/bamtools/../../lib64/libbamtools.a -lz 
filterBam compiled with BAMTOOLS=/mnt/storage/home/tk19812/scratch/software/bamtools/usr/local/include/bamtools
mv filterBam ../../../bin/filterBam
make[3]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/filterBam/src'
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/filterBam'
cd homGeneMapping; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/homGeneMapping'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
(cd src; make)
make[3]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/homGeneMapping/src'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/homGeneMapping/src'
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/homGeneMapping'
cd joingenes; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/joingenes'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/joingenes'
cd utrrnaseq/Debug; make all;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/utrrnaseq/Debug'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/utrrnaseq/Debug'
cd bam2wig; make;
make[2]: Entering directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/bam2wig'
make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[2]: `bam2wig' is up to date.
make[2]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs/bam2wig'
make[1]: Leaving directory `/mnt/storage/scratch/tk19812/software/augustus-3.4.0/auxprogs'

And it isn't working F

diekhans commented 3 years ago

try it without the -j $(nproc)