tseemann / nullarbor

:floppy_disk: :page_with_curl: "Reads to report" for public health and clinical microbiology
GNU General Public License v2.0

cannot read sequence 2 from input file #217

Closed thushandesilva closed 5 years ago

thushandesilva commented 5 years ago

Sorry, I'm new to nullarbor and having issues with the 2nd pair of read files being recognised. I get the following error:

[09:57:38] Hello ubuntu
[09:57:38] This is nullarbor.pl 2.0.20181010
[09:57:38] Send complaints to Torsten Seemann
[09:57:38] Scanning --ref for problematic sequence IDs...
[09:57:38] Using reference genome: /home/ubuntu/GAS_analysis/H293.dna
'09:57:38] ERROR: Isolate 'SW101A' - can not read sequence #2 of 1 files: '/home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW101A_S1_L001_R2_001.fastq.gz

The input file is organised as below:

SW101A /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW101A_S1_L001_R1_001.fastq.gz /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW101A_S1_L001_R2_001.fastq.gz
SW106E /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW106E_S3_L001_R1_001.fastq.gz /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW106E_S3_L001_R2_001.fastq.gz
SW110J /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW110J_S2_L001_R1_001.fastq.gz /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW110J_S2_L001_R2_001.fastq.gz
SW113F /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW113F_S4_L001_R1_001.fastq.gz /home/ubuntu/data_1/GAS_Gambia_Genomes_1/SW113F_S4_L001_R2_001.fastq.gz

Both R1 and R2 are in the same location and gzipped.

tseemann commented 5 years ago

Q1. The input.tab file needs to be TAB separated. Have you accidentally used spaces?

Q2. Can you paste the output of this command:

ls -lsa /home/ubuntu/data_1/GAS_Gambia_Genomes_1/
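
Q1 can also be checked mechanically. The following is a minimal sketch, not part of nullarbor: it assumes the three-column layout shown in this thread (isolate ID, R1 path, R2 path), and the function name `check_input_tab` and its `exists` parameter are made up for illustration.

```python
def check_input_tab(text, exists=None):
    """Return a list of problems found in the text of an input.tab file.

    Assumption: every row is "ID<TAB>R1 path<TAB>R2 path", as in this thread.
    `exists` is an optional callable (for example os.path.isfile) used to
    confirm that each read file is present on the machine running the check.
    """
    problems = []
    # Split on "\n" only, so stray carriage returns stay visible.
    for number, line in enumerate(text.split("\n"), start=1):
        if line == "":
            continue  # blank line, or the empty piece after the final newline
        if "\r" in line:
            problems.append(
                f"line {number}: carriage return found (Mac/Windows line ending?)"
            )
            continue  # fix the line endings first, then re-check
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(
                f"line {number}: expected 3 TAB-separated fields, found {len(fields)}"
            )
            continue
        for path in fields[1:]:
            if exists is not None and not exists(path):
                problems.append(f"line {number}: cannot find {path}")
    return problems


# Example use (the file name is hypothetical):
#   import os
#   with open("input.tab", newline="") as handle:   # newline="" keeps any "\r"
#       for problem in check_input_tab(handle.read(), exists=os.path.isfile):
#           print(problem)
```

An empty list means the file passed these particular checks only; it does not guarantee that nullarbor will accept it.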
thushandesilva commented 5 years ago

Thank you. No, I hadn't used spaces, but I had created the file on my Mac and uploaded it, which seems to have modified one field. I recreated it using a text editor on the GVL and it now appears to be running OK. Thanks!

tseemann commented 5 years ago

mac2unix file.txt solves this for me (and dos2unix for Windows files).
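
For anyone without those tools installed, here is a rough sketch of the same idea in Python. It is not how mac2unix or dos2unix are implemented; the function name `to_unix_newlines` is invented here, and unlike the real tools it handles both Windows and old-Mac endings in one pass.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert Windows (CR LF) and old Mac (CR) line endings to Unix (LF)."""
    # Handle "\r\n" first so a Windows ending does not become two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Example use (file names are hypothetical):
#   with open("file.txt", "rb") as src:
#       fixed = to_unix_newlines(src.read())
#   with open("file.unix.txt", "wb") as dst:
#       dst.write(fixed)
```

Working on bytes rather than text avoids any guess about the file's encoding, and tabs are left untouched.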

peflanag commented 4 years ago

Hey guys, this seems so hit and miss at times! I've tried many combinations of mac2unix, dos2unix -c mac, etc., but on some samples it just doesn't work!

I make the file in Excel on my Mac, save it as a tab-delimited file and then use it on Ubuntu, but on some runs I just get this error:

[12:11:01] Hello peter
[12:11:01] This is nullarbor.pl 2.0.20191013
[12:11:01] Send complaints to Torsten Seemann
[12:11:01] Scanning --ref for problematic sequence IDs...
[12:11:01] Using reference genome: /home/peter/Desktop/NCCP11945_GC.gb
[12:11:01] ERROR: Isolate '446823' - can not read sequence #1 of 1 files: '/Users/peterflanagan/IMRL_Sequencing/Sinead_GC/Run_1/Fastq/446823_S4_L001_R1_001.fastq.gz'
(nullarbor) peter@ubuntu:~$

Is there an easier way to make a tab file that's universal?

Cheers
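
One way to sidestep Excel entirely is to generate the file on the machine that will run nullarbor. The sketch below is only an illustration: it assumes Illumina-style names like the ones in this thread (`<ID>_S<n>_L<nnn>_R1_001.fastq.gz` with a matching R2), and `build_rows` and `format_tab` are hypothetical helper names.

```python
import os
import re

R1_SUFFIX = "_R1_001.fastq.gz"
R2_SUFFIX = "_R2_001.fastq.gz"
# Assumed naming scheme, e.g. SW101A_S1_L001_R1_001.fastq.gz
R1_PATTERN = re.compile(r"^(?P<isolate>.+)_S\d+_L\d{3}_R1_001\.fastq\.gz$")


def build_rows(directory, filenames):
    """Pair each R1 file with its R2 mate and return (ID, R1 path, R2 path) rows."""
    names = set(filenames)
    rows = []
    for name in sorted(names):
        match = R1_PATTERN.match(name)
        if match is None:
            continue  # not an R1 file in the assumed naming scheme
        mate = name[: -len(R1_SUFFIX)] + R2_SUFFIX
        if mate not in names:
            continue  # skip isolates whose R2 file is missing
        rows.append(
            (
                match.group("isolate"),
                os.path.join(directory, name),
                os.path.join(directory, mate),
            )
        )
    return rows


def format_tab(rows):
    """Render rows as TAB-separated text with Unix newlines."""
    return "".join("\t".join(row) + "\n" for row in rows)


# Example use (directory and output name are hypothetical):
#   fastq_dir = "/home/peter/Fastq"
#   with open("input.tab", "w", newline="\n") as out:   # force "\n" endings
#       out.write(format_tab(build_rows(fastq_dir, os.listdir(fastq_dir))))
```

Because the paths come from a directory listing on the same machine, they point at files that actually exist there, and no spreadsheet or cross-platform copy touches the separators or line endings.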

peflanag commented 4 years ago

Also, I ran od -a on my input file and there is no cr nl at the end of the lines. But some of the ends of the lines are missing nl and have ht. What's ht and how do I get it to nl?

Cheers.

@tseemann @ajmerritt1 sorry to tag ye, but ye know a lot more than me and Google hasn't helped! Probably because I don't know how to ask the specific question!

(nullarbor) peter@ubuntu:~$ dos2unix '/home/peter/Desktop/GC_Saab_IDs.txt'
dos2unix: converting file /home/peter/Desktop/GC_Saab_IDs.txt to Unix format...
(nullarbor) peter@ubuntu:~$ od -a '/home/peter/Desktop/GC_SaabIDs.txt'
0000000 4 4 6 8 2 3 ht / U s e r s / p e
0000020 t e r f l a n a g a n / I M R L
0000040 S e q u e n c i n g / S i n e
0000060 a d G C / R u n 1 / F a s t
0000100 q / 4 4 6 8 2 3 S 4 L 0 0 1
0000120 R 1 0 0 1 . f a s t q . g z
0000140 ht / U s e r s / p e t e r f l a
0000160 n a g a n / I M R L S e q u e
0000200 n c i n g / S i n e a d G C /
0000220 R u n 1 / F a s t q / 4 4 6 8
0000240 2 3 S 4 L 0 0 1 R 2 0 0
0000260 1 . f a s t q . g z nl 4 5 2 5 9
0000300 6 ht / U s e r s / p e t e r f l
0000320 a n a g a n / I M R L S e q u
0000340 e n c i n g / S i n e a d G C
0000360 / R u n 1 / F a s t q / 4 5 2
0000400 5 9 6 S 5 L 0 0 1 R 1 0
0000420 0 1 . f a s t q . g z ht / U s e
0000440 r s / p e t e r f l a n a g a n
0000460 / I M R L S e q u e n c i n g
0000500 / S i n e a d G C / R u n 1
0000520 / F a s t q / 4 5 2 5 9 6 S 5
0000540 L 0 0 1 R 2 0 0 1 . f a s
0000560 t q . g z nl 4 5 2 6 0 3 ht / U s
0000600 e r s / p e t e r f l a n a g a
0000620 n / I M R L S e q u e n c i n
0000640 g / S i n e a d G C / R u n
0000660 1 / F a s t q / 4 5 2 6 0 3 S
0000700 3 L 0 0 1 R 1 0 0 1 . f a
0000720 s t q . g z ht / U s e r s / p e
0000740 t e r f l a n a g a n / I M R L
0000760 S e q u e n c i n g / S i n e
0001000 a d G C / R u n 1 / F a s t
0001020 q / 4 5 2 6 0 3 S 3 L 0 0 1
0001040 R 2 0 0 1 . f a s t q . g z
0001060 nl 4 8 2 6 8 8 ht / U s e r s / p
0001100 e t e r f l a n a g a n / I M R
0001120 L S e q u e n c i n g / S i n
0001140 e a d G C / R u n 1 / F a s
0001160 t q / 4 8 2 6 8 8 S 2 L 0 0
0001200 1 R 1 0 0 1 . f a s t q . g
0001220 z ht / U s e r s / p e t e r f l
0001240 a n a g a n / I M R L S e q u
0001260 e n c i n g / S i n e a d G C
0001300 / R u n 1 / F a s t q / 4 8 2
0001320 6 8 8 S 2 L 0 0 1 R 2 0
0001340 0 1 . f a s t q . g z nl 4 8 2 7
0001360 0 9 ht / U s e r s / p e t e r f
0001400 l a n a g a n / I M R L S e q
0001420 u e n c i n g / S i n e a d G
0001440 C / R u n 1 / F a s t q / 4 8
0001460 2 7 0 9 S 1 L 0 0 1 R 1
0001500 0 0 1 . f a s t q . g z ht / U s
0001520 e r s / p e t e r f l a n a g a
0001540 n / I M R L S e q u e n c i n
0001560 g / S i n e a d G C / R u n
0001600 1 / F a s t q / 4 8 2 7 0 9 S
0001620 1 L 0 0 1 R 2 0 0 1 . f a
0001640 s t q . g z nl 4 8 3 9 4 9 ht / U
0001660 s e r s / p e t e r f l a n a g
0001700 a n / I M R L S e q u e n c i
0001720 n g / S i n e a d G C / R u n
0001740 1 / F a s t q / 4 8 3 9 4 9
0001760 S 9 L 0 0 1 R 1 0 0 1 . f
0002000 a s t q . g z ht / U s e r s / p
0002020 e t e r f l a n a g a n / I M R
0002040 L S e q u e n c i n g / S i n
0002060 e a d G C / R u n 1 / F a s
0002100 t q / 4 8 3 9 4 9 S 9 L 0 0
0002120 1 R 2 0 0 1 . f a s t q . g
0002140 z nl 4 8 3 9 5 0 ht / U s e r s /
0002160 p e t e r f l a n a g a n / I M
0002200 R L S e q u e n c i n g / S i
0002220 n e a d G C / R u n 1 / F a
0002240 s t q / 4 8 3 9 5 0 S 8 L 0
0002260 0 1 R 1 0 0 1 . f a s t q .
0002300 g z ht / U s e r s / p e t e r f
0002320 l a n a g a n / I M R L S e q
0002340 u e n c i n g / S i n e a d G
0002360 C / R u n 1 / F a s t q / 4 8
0002400 3 9 5 0 S 8 L 0 0 1 R 2
0002420 0 0 1 . f a s t q . g z nl 5 0 0
0002440 4 3 3 ht / U s e r s / p e t e r
0002460 f l a n a g a n / I M R L S e
0002500 q u e n c i n g / S i n e a d
0002520 G C / R u n 1 / F a s t q / 5
0002540 0 0 4 3 3 S 6 L 0 0 1 R 1
0002560 0 0 1 . f a s t q . g z ht / U
0002600 s e r s / p e t e r f l a n a g
0002620 a n / I M R L S e q u e n c i
0002640 n g / S i n e a d G C / R u n
0002660 1 / F a s t q / 5 0 0 4 3 3
0002700 S 6 L 0 0 1 R 2 0 0 1 . f
0002720 a s t q . g z nl 5 1 3 5 7 2 ht /
0002740 U s e r s / p e t e r f l a n a
0002760 g a n / I M R L S e q u e n c
0003000 i n g / S i n e a d G C / R u
0003020 n 1 / F a s t q / 5 1 3 5 7 2
0003040 S 7 L 0 0 1 R 1 0 0 1 .
0003060 f a s t q . g z ht / U s e r s /
0003100 p e t e r f l a n a g a n / I M
0003120 R L S e q u e n c i n g / S i
0003140 n e a d G C / R u n 1 / F a
0003160 s t q / 5 1 3 5 7 2 S 7 L 0
0003200 0 1 R 2 0 0 1 . f a s t q .
0003220 g z nl 8 4 3 9 5 0 - 2 ht / U s e
0003240 r s / p e t e r f l a n a g a n
0003260 / I M R L S e q u e n c i n g
0003300 / S i n e a d G C / R u n 1
0003320 / F a s t q / 8 4 3 9 5 0 - 2
0003340 S 1 0 L 0 0 1 R 1 0 0 1 .
0003360 f a s t q . g z ht / U s e r s /
0003400 p e t e r f l a n a g a n / I M
0003420 R L S e q u e n c i n g / S i
0003440 n e a d G C / R u n 1 / F a
0003460 s t q / 8 4 3 9 5 0 - 2 S 1 0
0003500 L 0 0 1 R 2 0 0 1 . f a s
0003520 t q . g z nl M 1 9 - 1 1 9 1 ht /
0003540 U s e r s / p e t e r f l a n a
0003560 g a n / I M R L S e q u e n c
0003600 i n g / S i n e a d G C / R u
0003620 n 2 / F a s t q / M 1 9 - 1 1
0003640 9 1 S 2 3 L 0 0 1 R 1 0
0003660 0 1 . f a s t q . g z ht / U s e
0003700 r s / p e t e r f l a n a g a n
0003720 / I M R L S e q u e n c i n g
0003740 / S i n e a d G C / R u n 2
0003760 / F a s t q / M 1 9 - 1 1 9 1
0004000 S 2 3 L 0 0 1 R 2 0 0 1 .
0004020 f a s t q . g z nl M 1 9 - 4 4 3
0004040 1 3 ht / U s e r s / p e t e r f
0004060 l a n a g a n / I M R L S e q
0004100 u e n c i n g / S i n e a d G
0004120 C / R u n 2 / F a s t q / M 1
0004140 9 - 4 4 3 1 3 6 S 8 L 0 0 1
0004160 R 1 0 0 1 . f a s t q . g z
0004200 ht / U s e r s / p e t e r f l a
0004220 n a g a n / I M R L S e q u e
0004240 n c i n g / S i n e a d G C /
0004260 R u n 2 / F a s t q / M 1 9 -
0004300 4 4 3 1 3 6 S 8 L 0 0 1 R
0004320 2 0 0 1 . f a s t q . g z nl M
0004340 1 9 - 4 4 3 6 9 2 ht / U s e r s
0004360 / p e t e r f l a n a g a n / I
0004400 M R L S e q u e n c i n g / S
0004420 i n e a d G C / R u n 2 / F
0004440 a s t q / M 1 9 - 4 4 3 6 9 2
0004460 S 7 L 0 0 1 R 1 0 0 1 . f
0004500 a s t q . g z ht / U s e r s / p
0004520 e t e r f l a n a g a n / I M R
0004540 L S e q u e n c i n g / S i n
0004560 e a d G C / R u n 2 / F a s
0004600 t q / M 1 9 - 4 4 3 6 9 2 S 7
0004620 L 0 0 1 R 2 0 0 1 . f a s
0004640 t q . g z nl M 1 9 - 4 8 1 3 4 6
0004660 ht / U s e r s / p e t e r f l a
0004700 n a g a n / I M R L S e q u e
0004720 n c i n g / S i n e a d G C /
0004740 R u n 2 / F a s t q / M 1 9 -
0004760 4 8 1 3 4 6 S 9 L 0 0 1 R
0005000 1 0 0 1 . f a s t q . g z ht /
0005020 U s e r s / p e t e r f l a n a
0005040 g a n / I M R L S e q u e n c
0005060 i n g / S i n e a d G C / R u
0005100 n 2 / F a s t q / M 1 9 - 4 8
0005120 1 3 4 6 S 9 L 0 0 1 R 2
0005140 0 0 1 . f a s t q . g z nl M 1 9
0005160 - 4 8 2 9 4 5 ht / U s e r s / p
0005200 e t e r f l a n a g a n / I M R
0005220 L S e q u e n c i n g / S i n
0005240 e a d G C / R u n 2 / F a s
0005260 t q / M 1 9 - 4 8 2 9 4 5 S 1
0005300 0 L 0 0 1 R 1 0 0 1 . f a
0005320 s t q . g z ht / U s e r s / p e
0005340 t e r f l a n a g a n / I M R L
0005360 S e q u e n c i n g / S i n e
0005400 a d G C / R u n 2 / F a s t
0005420 q / M 1 9 - 4 8 2 9 4 5 S 1 0
0005440 L 0 0 1 R 2 0 0 1 . f a s
0005460 t q . g z nl M 1 9 - 5 1 0 1 6 1
0005500 ht / U s e r s / p e t e r f l a
0005520 n a g a n / I M R L S e q u e
0005540 n c i n g / S i n e a d G C /
0005560 R u n 2 / F a s t q / M 1 9 -
0005600 5 1 0 1 6 1 S 3 L 0 0 1 R
0005620 1 0 0 1 . f a s t q . g z ht /
0005640 U s e r s / p e t e r f l a n a
0005660 g a n / I M R L S e q u e n c
0005700 i n g / S i n e a d G C / R u
0005720 n 2 / F a s t q / M 1 9 - 5 1
0005740 0 1 6 1 S 3 L 0 0 1 R 2
0005760 0 0 1 . f a s t q . g z nl M 1 9
0006000 - 5 1 0 1 6 6 ht / U s e r s / p
0006020 e t e r f l a n a g a n / I M R
0006040 L S e q u e n c i n g / S i n
0006060 e a d G C / R u n 2 / F a s
0006100 t q / M 1 9 - 5 1 0 1 6 6 S 4
0006120 L 0 0 1 R 1 0 0 1 . f a s
0006140 t q . g z ht / U s e r s / p e t
0006160 e r f l a n a g a n / I M R L
0006200 S e q u e n c i n g / S i n e a
0006220 d G C / R u n 2 / F a s t q
0006240 / M 1 9 - 5 1 0 1 6 6 S 4 L
0006260 0 0 1 R 2 0 0 1 . f a s t q
0006300 . g z nl M 1 9 - 5 1 0 1 6 7 ht /
0006320 U s e r s / p e t e r f l a n a
0006340 g a n / I M R L S e q u e n c
0006360 i n g / S i n e a d G C / R u
0006400 n 2 / F a s t q / M 1 9 - 5 1
0006420 0 1 6 7 S 5 L 0 0 1 R 1
0006440 0 0 1 . f a s t q . g z ht / U s
0006460 e r s / p e t e r f l a n a g a
0006500 n / I M R L S e q u e n c i n
0006520 g / S i n e a d G C / R u n
0006540 2 / F a s t q / M 1 9 - 5 1 0 1
0006560 6 7 S 5 L 0 0 1 R 2 0 0
0006600 1 . f a s t q . g z nl M 1 9 - 5
0006620 1 6 5 3 8 ht / U s e r s / p e t
0006640 e r f l a n a g a n / I M R L
0006660 S e q u e n c i n g / S i n e a
0006700 d G C / R u n 2 / F a s t q
0006720 / M 1 9 - 5 1 6 5 3 8 S 6 L
0006740 0 0 1 R 1 0 0 1 . f a s t q
0006760 . g z ht / U s e r s / p e t e r
0007000 f l a n a g a n / I M R L S e
0007020 q u e n c i n g / S i n e a d
0007040 G C / R u n 2 / F a s t q / M
0007060 1 9 - 5 1 6 5 3 8 S 6 L 0 0
0007100 1 R 2 0 0 1 . f a s t q . g
0007120 z nl M 2 0 - 0 0 1 3 ht / U s e r
0007140 s / p e t e r f l a n a g a n /
0007160 I M R L S e q u e n c i n g /
0007200 S i n e a d G C / R u n 2 /
0007220 F a s t q / M 2 0 - 0 0 1 3 S
0007240 2 4 L 0 0 1 R 1 0 0 1 . f
0007260 a s t q . g z ht / U s e r s / p
0007300 e t e r f l a n a g a n / I M R
0007320 L S e q u e n c i n g / S i n
0007340 e a d G C / R u n 2 / F a s
0007360 t q / M 2 0 - 0 0 1 3 S 2 4
0007400 L 0 0 1 R 2 0 0 1 . f a s t
0007420 q . g z nl M 2 0 - 1 9 4 3 5 9 3
0007440 ht / U s e r s / p e t e r f l a
0007460 n a g a n / I M R L S e q u e
0007500 n c i n g / S i n e a d G C /
0007520 R u n 2 / F a s t q / M 2 0 -
0007540 1 9 4 3 5 9 3 S 1 1 L 0 0 1
0007560 R 1 0 0 1 . f a s t q . g z
0007600 ht / U s e r s / p e t e r f l a
0007620 n a g a n / I M R L S e q u e
0007640 n c i n g / S i n e a d G C /
0007660 R u n 2 / F a s t q / M 2 0 -
0007700 1 9 4 3 5 9 3 S 1 1 L 0 0 1
0007720 R 2 0 0 1 . f a s t q . g z
0007740 nl M 2 0 - 2 0 9 1 3 5 6 r p t ht
0007760 / U s e r s / p e t e r f l a n
0010000 a g a n / I M R L S e q u e n
0010020 c i n g / S i n e a d G C / R
0010040 u n 2 / F a s t q / M 2 0 - 2
0010060 0 9 1 3 5 6 r p t S 1 L 0 0
0010100 1 R 1 0 0 1 . f a s t q . g
0010120 z ht / U s e r s / p e t e r f l
0010140 a n a g a n / I M R L S e q u
0010160 e n c i n g / S i n e a d G C
0010200 / R u n 2 / F a s t q / M 2 0
0010220 - 2 0 9 1 3 5 6 r p t S 1 L
0010240 0 0 1 R 2 0 0 1 . f a s t q
0010260 . g z nl M 2 0 - 2 1 1 0 1 3 3 ht
0010300 / U s e r s / p e t e r f l a n
0010320 a g a n / I M R L S e q u e n
0010340 c i n g / S i n e a d G C / R
0010360 u n 2 / F a s t q / M 2 0 - 2
0010400 1 1 0 1 3 3 S 2 L 0 0 1 R
0010420 1 0 0 1 . f a s t q . g z ht /
0010440 U s e r s / p e t e r f l a n a
0010460 g a n / I M R L S e q u e n c
0010500 i n g / S i n e a d G C / R u
0010520 n 2 / F a s t q / M 2 0 - 2 1
0010540 1 0 1 3 3 S 2 L 0 0 1 R 2
0010560 _ 0 0 1 . f a s t q . g z
0010575
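
On the "what's ht" question: in od -a output, ht is the horizontal tab character (the field separator) and nl is the newline that ends a line, with cr being the carriage return that mac2unix/dos2unix remove. od also wraps its display every 16 bytes, so the end of an od row is not the end of a line in the file. A small sketch of that naming follows; `OD_NAMES` covers only four characters rather than od's full table, and both function names are made up for illustration.

```python
# Partial, illustrative mapping of bytes to the names od -a prints for them.
OD_NAMES = {0x09: "ht", 0x0A: "nl", 0x0D: "cr", 0x20: "sp"}


def od_a_names(data: bytes):
    """Name each byte roughly as od -a would (printable ASCII input assumed)."""
    return [OD_NAMES.get(byte, chr(byte)) for byte in data]


def line_ending_counts(data: bytes):
    """Count Windows (CR LF), old Mac (lone CR) and Unix (lone LF) endings."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
    }


# Example use (file name is hypothetical):
#   with open("input.tab", "rb") as handle:
#       print(line_ending_counts(handle.read()))
```

Read that way, the od -a output above appears to show ht between the three columns and nl at the end of each row except the last, with no cr anywhere. The paths in it begin with /Users/peterflanagan/ while the session is running as peter on ubuntu, so it may be worth confirming with ls that those files exist at those paths on the machine running nullarbor.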