Open zgb963 opened 9 months ago
Hi, I'm also having the same issue; any advice?
I also encountered this problem, have you solved it?
@yeeus Not yet. I heard from someone that Liftoff needs to be run with a GTF file rather than a GFF file, so I tried that, but I got the following error: 'GFF does not contain any gene features. Use -f to provide a list of other feature types to lift over.'
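In case it helps anyone hitting the same message: in both GFF3 and GTF, the third tab-separated column holds the feature type, and some GTF files contain only transcript and exon rows with no gene rows. The error text suggests that is what Liftoff is objecting to. A minimal sketch for checking which feature types a file actually contains is below. The function name and the command-line usage are my own illustration and not part of Liftoff.

```python
import gzip
import sys
from collections import Counter


def count_feature_types(path):
    """Count the values in column 3 (feature type) of a GFF3 or GTF file."""
    # Handle gzip-compressed annotation files transparently.
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            # A GFF3 file may end with an embedded FASTA section; stop there.
            if line.startswith("##FASTA"):
                break
            # Skip comments, directives, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_types.py <annotation file>
    for feature_type, n in count_feature_types(sys.argv[1]).most_common():
        print(f"{feature_type}\t{n}")
```

If `gene` is missing from the output, the types that are present are presumably what the `-f` option mentioned in the error message expects to be given.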
We'll look into this, but Liftoff usually runs in no more than an hour or two on a mammalian genome, so if it's running for many hours, something is wrong. It doesn't need that much memory. However, it seems you are lifting human annotation onto rhesus macaque, which is pretty distant from human (at the DNA level). This means that minimap2 will likely have trouble mapping many genes. You might instead try our newer LiftOn program, which is designed for more distant mapping problems. It uses Liftoff as a module, and also miniprot. Check it out here: https://github.com/Kuanhao-Chao/LiftOn/blob/main/README.md
Hello,
I've been having issues running Liftoff. It takes days to run and then terminates. I'm running it in an HPC environment using 100 GB of memory on a compute node that has 2000 cores. The command below is what I'm using to run Liftoff. The target genome is the rheMac10 FASTA, and I've also provided the human genome hg38 FASTA and the human genome annotation GFF.
Here is the bsub command I used to submit my script
And here is my script
However, it has been running for several days and it's stuck on lifting features.
extracting features
2024-01-23 11:57:09,016 - INFO - Populating features
2024-01-23 12:04:20,319 - INFO - Populating features table and first-order relations: 4900134 features
2024-01-23 12:04:20,319 - INFO - Updating relations
2024-01-23 12:05:01,905 - INFO - Creating relations(parent) index
2024-01-23 12:05:05,589 - INFO - Creating relations(child) index
2024-01-23 12:05:10,210 - INFO - Creating features(featuretype) index
2024-01-23 12:05:14,158 - INFO - Creating features (seqid, start, end) index
2024-01-23 12:05:19,103 - INFO - Creating features (seqid, start, end, strand) index
2024-01-23 12:05:24,253 - INFO - Running ANALYZE features
aligning features
[M::main::16.311*0.41] loaded/built the index for 2939 target sequence(s)
[M::mm_mapopt_update::17.590*0.45] mid_occ = 596
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 2939
[M::mm_idx_stat::18.371*0.48] distinct minimizers: 101324913 (39.04% are singletons); average occurrences: 5.469; average spacing: 5.362; total length: 2971331530
[M::worker_pipeline::226.359*3.67] mapped 10628 sequences
[M::worker_pipeline::382.816*3.79] mapped 10362 sequences
[M::worker_pipeline::555.968*3.84] mapped 12280 sequences
[M::worker_pipeline::711.785*3.85] mapped 14834 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -o intermediate_files/reference_all_to_target_all.sam -a --end-bonus 5 --eqx -N 50 -p 0.5 -t 32 liftoff/rheMac10.fa.gz.mmi intermediate_files/reference_all_genes.fa
[M::main] Real time: 712.151 sec; CPU: 2743.497 sec; Peak RSS: 27.401 GB
lifting feature
Am I using enough memory or cores/threads for liftoff? Is there a typical runtime for lifting over features from one large genome to another?
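For reference, the final minimap2 line in the log already reports how much time and memory the alignment stage used. Below is a small sketch that pulls those numbers out and estimates the average number of cores kept busy (CPU time divided by wall-clock time). The function name and the returned dictionary keys are my own and purely illustrative. The regular expression assumes the summary line has exactly the shape shown in the log above.

```python
import re

# Matches the minimap2 summary line as it appears in the log above.
SUMMARY = re.compile(
    r"Real time: (?P<real>[\d.]+) sec; "
    r"CPU: (?P<cpu>[\d.]+) sec; "
    r"Peak RSS: (?P<rss>[\d.]+) GB"
)


def minimap2_summary(log_text):
    """Return wall time, CPU time, peak memory, and average cores used."""
    match = SUMMARY.search(log_text)
    if match is None:
        return None  # no summary line found in this text
    real = float(match["real"])
    cpu = float(match["cpu"])
    return {
        "real_sec": real,
        "cpu_sec": cpu,
        "peak_rss_gb": float(match["rss"]),
        # CPU seconds per wall-clock second approximates cores kept busy.
        "avg_cores_used": cpu / real,
    }


if __name__ == "__main__":
    line = "[M::main] Real time: 712.151 sec; CPU: 2743.497 sec; Peak RSS: 27.401 GB"
    print(minimap2_summary(line))
```

On the line from this run, that works out to roughly 12 minutes of wall time, about 27.4 GB peak memory, and an average of about 3.85 busy cores despite `-t 32`. So, at least for the minimap2 stage, the 100 GB allocation was not the limit, and the days of runtime were spent after alignment, in the lifting step.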