Open desmodus1984 opened 1 year ago
Hi,
Is it possible to share the contents of the log file?
How large is the file you are trying to convert to elfasta? (myse_Super_myo_final.fa)
Thanks.
Charlotte
On 28 Feb 2023, at 21:14, desmodus1984 wrote:
Hi, I am trying to run elprep for variant calling, and I am first trying to convert the reference into elfasta format. I have tried it on two different servers, one with the up-to-date version and the other with an older 5.X version. The command
elprep fasta-to-elfasta myse_Super_myo_final.fa myse_Super_myo_final.elfasta
gives me the following error:
2023/02/28 15:07:09 Created log file at /users/PHS0338/jpac1984/logs/elprep/elprep-2023-02-28-15-07-09-560735636-EST.log
2023/02/28 15:07:09 Command line: [elprep fasta-to-elfasta myse_Super_myo_final.fa myse_Super_myo_final.elfasta]
2023/02/28 15:07:09 bufio.Scanner: token too long
I have tried running elprep within conda and with the latest release, and both times I got the same error.
I hope this error can be fixed soon.
Thanks.
Hi Charlotte,
The log file is basically this:
elprep version 5.1.3 compiled with go1.17.6 - see http://github.com/exascience/elprep for more information.
2023/03/01 12:06:04 Created log file at /users/PHS0338/jpac1984/logs/elprep/elprep-2023-03-01-12-06-04-871312972-EST.log
2023/03/01 12:06:04 Command line: [elprep fasta-to-elfasta ragtag.356.fasta ragtag.356.elfasta]
2023/03/01 12:06:04 bufio.Scanner: token too long
The file I am trying to convert now is 113 MB. The previous file I tried to convert, which I could convert before but now cannot, is 1.9 GB.
The command I used was:
elprep fasta-to-elfasta ragtag.356.fasta ragtag.356.elfasta
Juan Pablo Aguilar Cabezas
Ecology and Evolutionary Biology Ph.D. Candidate
Department of Biological Sciences
Ohio University, Athens OH
From: Charlotte Herzeel
Sent: Wednesday, March 1, 2023 9:14 AM
To: ExaScience/elprep
Cc: Aguilar Cabezas, Juan Pablo; Author
Subject: Re: [ExaScience/elprep] elfasta-conversion error: bufio.Scanner: token too long (Issue #68)
I've got the same issue:
The reference file is about 750MB
This is all the log reveals:
2023/04/26 11:26:54 Command line: [elprep fasta-to-elfasta /kyukon/scratch/gent/vo/001/gvo00126/vsc44381/refgenomes/bzrefgenome/filtered.asm.cns.fna /kyukon/scratch/gent/vo/001/gvo00126/vsc44381/refgenomes/bzrefgenome/filtered.asm.cns.elfasta]
2023/04/26 11:26:54 bufio.Scanner: token too long
I have solved the issue by limiting the number of characters per line to 80, like my other references:
seqkit seq reference.fasta -w 80
The resulting FASTA file should be accepted by elprep fasta-to-elfasta.
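For anyone without seqkit at hand, the same re-wrapping can be sketched in a few lines of Python. This is only a minimal illustration of the workaround described above, not part of elprep or seqkit; the function name wrap_fasta and the demo record are made up for the example. Note also that seqkit prints to standard output unless an output file is given with -o, so the command above needs a redirect or -o to produce a file.

```python
import io


def wrap_fasta(src, dst, width=80):
    """Copy FASTA text from src to dst, re-wrapping sequence lines to `width` columns.

    Header lines (starting with '>') are copied unchanged. Sequence data is
    streamed, so only one input line plus a short remainder is held in memory.
    """
    carry = ""  # leftover bases that do not yet fill a complete output line
    for line in src:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if carry:  # finish the previous record before the new header
                dst.write(carry + "\n")
                carry = ""
            dst.write(line + "\n")
            continue
        carry += line
        full = len(carry) - len(carry) % width  # length covered by complete lines
        for i in range(0, full, width):
            dst.write(carry[i:i + width] + "\n")
        carry = carry[full:]
    if carry:
        dst.write(carry + "\n")


# Tiny in-memory demonstration with a hypothetical record and a width of 4.
demo_fasta_out = io.StringIO()
wrap_fasta(io.StringIO(">chr1 demo\nACGTACGTAC\nGT\n>chr2\nAC\n"), demo_fasta_out, width=4)
print(demo_fasta_out.getvalue(), end="")
```

For a real file, open the input for reading and a new output file for writing, pass both to wrap_fasta, and then run elprep fasta-to-elfasta on the new file.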
Hi all,
I got the same error message when I used elprep vcf-to-elsites. My VCF file contains 1,136 samples; the longest line is over 170,000 characters. I removed the lines with over 60,000 characters from the vcf file, and elprep successfully ran.
Unlike FASTA files, VCF records cannot be wrapped with line breaks, so re-wrapping is not an option here. I hope this error can be fixed.
Error log is:
elprep version 5.1.3 compiled with go1.17.6 - see http://github.com/exascience/elprep for more information.
2023/06/13 08:32:00 Created log file at /home/stakuno/logs/elprep/elprep-2023-06-13-08-32-00-557298385-JST.log
2023/06/13 08:32:00 Command line: [elprep vcf-to-elsites ./vcf1/all_indel.vcf ./vcf1/all_indel.elsites]
2023/06/13 08:32:00 bufio.Scanner: token too long
panic: bufio.Scanner: token too long
goroutine 1 [running]:
log.Panic({0xc000129c90, 0xc000129ca0, 0x66c580})
    /opt/conda/conda-bld/elprep_1651164620400/_build_env/go/src/log/log.go:354 +0x65
github.com/exascience/elprep/v5/internal.RunPipeline(0xc000228300)
    /opt/conda/conda-bld/elprep_1651164620400/work/internal/misc.go:34 +0x55
github.com/exascience/elprep/v5/intervals.FromVcfFile({0x7ffcc1fe023a, 0x18})
    /opt/conda/conda-bld/elprep_1651164620400/work/intervals/intervals.go:336 +0x432
github.com/exascience/elprep/v5/cmd.VcfToElsites()
    /opt/conda/conda-bld/elprep_1651164620400/work/cmd/convert.go:47 +0x1a5
main.main()
    /opt/conda/conda-bld/elprep_1651164620400/work/main.go:75 +0x317
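Since the long lines in a VCF with 1,136 samples come mostly from the per-sample genotype columns, one possible workaround that keeps every site is to drop those columns before conversion. The Python sketch below keeps only the eight fixed VCF columns (CHROM through INFO). It rests on an assumption that has not been confirmed in this thread: that vcf-to-elsites only needs the site coordinates and not the genotypes. The function name drop_samples and the demo records are illustrative. Records with very long REF, ALT, or INFO fields could still exceed the limit.

```python
import io


def drop_samples(src, dst):
    """Write a sites-only copy of a VCF: FORMAT and sample columns are removed.

    Assumption (not confirmed by the elprep authors in this thread): the
    downstream conversion only needs the first eight columns.
    """
    for line in src:
        if line.startswith("##"):
            dst.write(line)  # meta-information lines are copied unchanged
            continue
        # Applies to the '#CHROM' header line and to every data record.
        fields = line.rstrip("\r\n").split("\t", 8)[:8]
        dst.write("\t".join(fields) + "\n")


# Tiny in-memory demonstration with two hypothetical samples.
demo_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "chr1\t100\t.\tA\tAT\t50\tPASS\tDP=10\tGT\t0/1\t1/1\n"
)
demo_sites_out = io.StringIO()
drop_samples(io.StringIO(demo_vcf), demo_sites_out)
print(demo_sites_out.getvalue(), end="")
```

bcftools offers the same operation as bcftools view -G (drop genotypes), which is probably the more robust choice for compressed or indexed files.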
Hello. I have run into a similar error.
$ which go && go version
~/.conda/envs/WES_dev2/bin/go
go version go1.17.6 linux/amd64
$ which elprep && elprep
~/.conda/envs/WES_dev2/bin/elprep
elprep version 5.1.3 compiled with go1.17.6
The command I used:
elprep sfm $sam_path $bam_path \
  --output-type bam \
  --replace-read-group "ID:group1 LB:1 PL:illumina PU:unit1 SM:sample" \
  --mark-duplicates \
  --mark-optical-duplicates $dupmetrics \
  --remove-duplicates \
  --sorting-order coordinate \
  --bqsr $cal_tbl_path \
  --known-sites $dbsnp_elsites \
  --target-regions $bed_file \
  --reference $elref_path \
  --haplotypecaller $vcf_out \
  --timed
Output from my terminal:
elprep version 5.1.3 compiled with go1.17.6 - see http://github.com/exascience/elprep for more information.
2024/03/21 10:44:31 Created log file at /home/areese/logs/elprep/elprep-2024-03-21-10-44-31-633572959-EDT.log
2024/03/21 10:44:31 Command line: [elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --remove-duplicates --sorting-order coordinate --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --timed]
2024/03/21 10:44:31 Executing command: elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --optical-duplicates-pixel-distance 100 --remove-duplicates --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --quantize-levels 0 --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --sorting-order coordinate --timed --intermediate-files-output-prefix debug.reads_to_hg38.P14 --intermediate-files-output-type sam
2024/03/21 10:44:31 Splitting...
2024/03/21 10:53:23 Filtering (phase 1)...
2024/03/21 10:53:24 exit status 2
2024/03/21 10:53:23 Created log file at /home/areese/logs/elprep/elprep-2024-03-21-10-53-23-776329743-EDT.log
2024/03/21 10:53:23 Command line: [elprep filter /home/shared/repos/RNA-seq_develop/elprep-splits-f58c8369-975d-439f-8426-5b564ed24d3b/splits/debug.reads_to_hg38.P14-unmapped.sam /home/shared/repos/RNA-seq_develop/elprep-splits-processed-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --remove-duplicates --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --sorting-order coordinate --timed --pg-cmd-line elprep sfm /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/debug.reads_to_hg38.P14.sam /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.processed.bam --output-type bam --replace-read-group ID:group1 LB:1 PL:illumina PU:unit1 SM:sample --mark-duplicates --mark-optical-duplicates /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/reads_to_hg38.P14.markedDuplicatesMetrics.txt --optical-duplicates-pixel-distance 100 --remove-duplicates --bqsr /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/bqsr_recalibration.tbl --reference /home/shared/projects/RNA-seq/references/UCSC/latest/hg38.P14.elfasta --quantize-levels 0 --max-cycle 500 --known-sites /home/shared/databases/dbsnp/Homo_sapiens_assembly38.dbsnp138.elsites --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed --haplotypecaller /home/shared/projects/WES/PilotStudy/ILN2_2024AE/HTB-2D-TR1_elprep/map_reads/HTB-2D-TR1_pilotWES_reads_to_hg38.P14.vcf --sorting-order coordinate --timed --intermediate-files-output-prefix debug.reads_to_hg38.P14 --intermediate-files-output-type sam --bqsr-tables-only /home/shared/repos/RNA-seq_develop/elprep-tabs-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam.elrecal --mark-optical-duplicates-intermediate /home/shared/repos/RNA-seq_develop/elprep-metrics-f58c8369-975d-439f-8426-5b564ed24d3b/debug.reads_to_hg38.P14-unmapped.sam --optical-duplicates-pixel-distance 100 --target-regions /home/shared/projects/WES/hg38_Twist_ILMN_Exome_2.0_Plus_Panel_Combined_Mito.UCSC.bed]
2024/03/21 10:53:24 bufio.Scanner: token too long
panic: bufio.Scanner: token too long
goroutine 1 [running]:
log.Panic({0xc0001c5458, 0xc002331890, 0xc002333750})
    /opt/conda/conda-bld/elprep_1651164620400/_build_env/go/src/log/log.go:354 +0x65
github.com/exascience/elprep/v5/bed.ParseBed({0x7ffd3bb9ba8c, 0xc00011e030})
    /opt/conda/conda-bld/elprep_1651164620400/work/bed/bed-files.go:57 +0x56b
github.com/exascience/elprep/v5/cmd.Filter()
    /opt/conda/conda-bld/elprep_1651164620400/work/cmd/filter.go:738 +0x353f
main.main()
    /opt/conda/conda-bld/elprep_1651164620400/work/main.go:67 +0x245
I did not have issues with either the vcf-to-elsites or the fasta-to-elfasta command. Thanks.
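The trace above ends in bed.ParseBed, which suggests the over-long line is in the file passed to --target-regions rather than in the reads. To locate such lines in any text input (BED, VCF, or FASTA), a small Python sketch like the one below may help. It assumes that elprep relies on the default limit of Go's bufio.Scanner, which is 64 KiB per line (bufio.MaxScanTokenSize); that assumption fits the earlier report that removing VCF lines over 60,000 characters made the error disappear, but it has not been confirmed by the maintainers. The names GO_MAX_SCAN_TOKEN and long_lines are illustrative.

```python
import io

# Default per-token limit of Go's bufio.Scanner (bufio.MaxScanTokenSize).
# Assumption: elprep 5.1.3 uses this default when reading text files.
GO_MAX_SCAN_TOKEN = 64 * 1024


def long_lines(stream, limit=GO_MAX_SCAN_TOKEN):
    """Yield (line_number, byte_length) for every line of `limit` bytes or more.

    `stream` must be opened in binary mode so lengths are counted in bytes.
    Lines are split on '\\n' only, so a file that uses bare '\\r' line endings
    shows up as one very long line.
    """
    for number, raw in enumerate(stream, start=1):
        length = len(raw.rstrip(b"\r\n"))
        if length >= limit:
            yield number, length


# Tiny in-memory demonstration: only the second line exceeds the limit.
demo_bytes = io.BytesIO(b"chr1\t0\t100\n" + b"A" * 70000 + b"\n" + b"chr2\t5\t50\n")
demo_hits = list(long_lines(demo_bytes))
print(demo_hits)
```

For a real file, open it with mode "rb" and iterate over long_lines; an empty result would point to a cause other than line length.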