gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License

provide separate gff and fasta files as input for panaroo #297

Closed sun-qibo closed 5 months ago

sun-qibo commented 5 months ago

Hello, I read in the manual that panaroo can take separate GFF and FASTA files as input for each isolate, with the two file names on one line separated by a space or a tab. If I understand correctly, this means I create an input text file like this:

genome1.gff     genome1.fna
genome2.gff     genome2.fna
...
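For anyone assembling such a list by hand, here is a minimal sketch of one way to generate the lines. It assumes each GFF and its FASTA share a file stem (genome1.gff with genome1.fna); the helper name `build_input_lines` is hypothetical and not part of panaroo.

```python
from pathlib import Path


def build_input_lines(gff_paths, fasta_paths):
    """Pair each GFF with the FASTA that shares its file stem.

    Hypothetical helper, not part of panaroo. Returns one tab-delimited
    "gff<TAB>fasta" line per isolate, sorted by GFF path.
    """
    fasta_by_stem = {Path(p).stem: p for p in fasta_paths}
    lines = []
    for gff in sorted(gff_paths):
        stem = Path(gff).stem
        if stem not in fasta_by_stem:
            # Fail early rather than write an incomplete pair.
            raise ValueError(f"no FASTA file found for {gff}")
        lines.append(f"{gff}\t{fasta_by_stem[stem]}")
    return lines


if __name__ == "__main__":
    pairs = build_input_lines(
        ["genome1.gff", "genome2.gff"],
        ["genome1.fna", "genome2.fna"],
    )
    # Write the list to the file passed to panaroo with -i.
    Path("input_file_test.txt").write_text("\n".join(pairs) + "\n")
```

The FASTA in each pair needs to be the genome assembly (contigs) that the GFF was annotated from, since the sequence IDs in the GFF are looked up in that file.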

However, I got the error below:

$ panaroo -i input_file_test.txt -o results --clean-mode strict
pre-processing gff3 files...
0%| | 0/1 [00:00
Sequence ID not found in Fasta! NODE_60_length_99090_cov_13.5122
0%| | 0/1 [00:01
Error reading prokka input!
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 436, in _process_worker
    r = call_item()
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 288, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/panaroo/prokka.py", line 214, in get_gene_sequences
    else: raise ValueError("Invalid gene sequence!")
ValueError: Invalid gene sequence!
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/panaroo/prokka.py", line 304, in process_prokka_input
    gene_sequence_list = Parallel(n_jobs=n_cpu)(
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/parallel.py", line 1056, in __call__
    self.retrieve()
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/parallel.py", line 935, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
ValueError: Invalid gene sequence!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sunqb/miniconda3/envs/ai_env/bin/panaroo", line 8, in <module>
    sys.exit(main())
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/panaroo/main.py", line 325, in main
    process_prokka_input(args.input_files, args.output_dir,
  File "/home/sunqb/miniconda3/envs/ai_env/lib/python3.8/site-packages/panaroo/prokka.py", line 316, in process_prokka_input
    raise RuntimeError("Error reading prokka input!")
RuntimeError: Error reading prokka input!

In panaroo/prokka.py, at line 133 (get_gene_sequences), I don't see any check for whether each entry in the list is a single GFF3 file per isolate or a separate GFF and FASTA pair. Am I missing something?

So does panaroo really support separate gff and fasta files?

Thank you in advance!

gtonkinhill commented 5 months ago

Hi,

It looks like some of your annotations may not match the standard Prokka GFF3 output format. You can discard these annotations using the --remove-invalid-genes flag.

sun-qibo commented 5 months ago

Thank you. Actually, it was my mistake: I had supplied gene sequences instead of genome sequences. Now that I have changed the file names to point to the genome files, it works perfectly.
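In case it helps anyone who hits the same "Sequence ID not found in Fasta!" message, below is a rough sketch of a pre-flight check that lists the sequence IDs used in a GFF but absent from the paired FASTA. It is not panaroo's own code, and the function names are made up for illustration. It assumes the sequence ID is the first tab-separated column of each GFF feature line and the first whitespace-delimited token of each FASTA header.

```python
def gff_seqids(gff_text):
    """Collect sequence IDs (column 1) from GFF feature lines."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        ids.add(line.split("\t")[0])
    return ids


def fasta_ids(fasta_text):
    """Collect record IDs (first token after '>') from FASTA headers."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def missing_seqids(gff_text, fasta_text):
    """Return GFF sequence IDs that have no matching FASTA record."""
    return sorted(gff_seqids(gff_text) - fasta_ids(fasta_text))


if __name__ == "__main__":
    # Made-up example: the GFF refers to a contig, but the FASTA
    # holds per-gene sequences, so the contig ID is reported missing.
    gff = "##gff-version 3\ncontig_1\tProdigal\tCDS\t1\t9\t.\t+\t0\tID=gene_1\n"
    gene_fasta = ">gene_1 hypothetical protein\nATGAAATAA\n"
    print(missing_seqids(gff, gene_fasta))
```

An empty list means every contig named in the GFF is present in the FASTA. A non-empty list usually points to the wrong FASTA being paired with the annotation, as happened here.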