bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Can't use absolute paths in --create-db (poppunk==2.5.0) #231

Closed: bioinformagica closed this issue 1 year ago

bioinformagica commented 1 year ago

Dear developers, I'm having a little trouble using absolute paths with --create-db.

When I run with --output ABSPATH I get an error, but this does not happen when I use a RELATIVEPATH, as in the tutorial.

As a quick fix I moved the database after creating it, but I thought it would be worth reporting the issue here.

Using an absolute path (error):

$ poppunk --create-db --output /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db --r-files /home/hugo/projects/reparoma/data/rlist.txt --plot-fit 10 --threads 3 --min-k 14 --max-k 29
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)                                                                                                                    
        (with backend: sketchlib v2.0.0                                                                                                                                     
         sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 3 threads                                                                                                                  
Mode: Building new database from input sequences                                                                                                                            
Sketching 10 genomes using 3 thread(s)                                                                                                                                      
Progress (CPU): 10 / 10                                                                                                                                                     
Writing sketches to file                                                                                                                                                    
Calculating random match chances using Monte Carlo                                                                                                                          
Calculating distances using 3 thread(s)                                                                                                                                     
Progress (CPU): 100.0%                                                                                                                                                      
Traceback (most recent call last):                                                                                                                                          
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/bin/poppunk", line 11, in <module>                                                   
    sys.exit(main())                                                                                                                                                        
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/__main__.py", line 317, in main                 
    distMat = queryDatabase(rNames = seq_names,                                                                                                                             
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/sketchlib.py", line 559, in queryDatabase       
    plot_fit(klist,                                                                   
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/plot.py", line 126, in plot_fit          
    plt.savefig(out_prefix + ".pdf", bbox_inches='tight')                                                
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/pyplot.py", line 944, in savefig
    res = fig.savefig(*args, **kwargs)                                                
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/figure.py", line 3277, in savefig
    self.canvas.print_figure(fname, **kwargs)                                                            
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 2338, in print_figure
    result = print_method(                                                            
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 2204, in <lambda>    
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(                                   
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backends/backend_pdf.py", line 2808, in print_pdf
    file = PdfFile(filename, metadata=metadata)                                                          
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backends/backend_pdf.py", line 713, in __init__
    fh, opened = cbook.to_filehandle(filename, "wb", return_opened=True)                                 
  File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/cbook/__init__.py", line 492, in to_filehandle
    fh = open(fname, flag, encoding=encoding)                                                            
FileNotFoundError: [Errno 2] No such file or directory: '/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db//home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db_fit_example_1.pdf' 

Using a relative path (works):

$ poppunk --create-db --output $(basename /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db) --r-files /home/hugo/projects/reparoma/data/rlist.txt --plot-fit 10 --threads 3 --min-k 14 --max-k 29 && mv -v $(basename /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db) /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db 

PopPUNK (POPulation Partitioning Using Nucleotide Kmers) 
        (with backend: sketchlib v2.0.0
         sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 3 threads
Mode: Building new database from input sequences
Sketching 10 genomes using 3 thread(s)
Progress (CPU): 10 / 10
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 3 thread(s)
Progress (CPU): 100.0%

Done
renamed 's_genus_poppunk_db' -> '/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db'

Note: I'm using basename to get relative paths because these command lines are created by Snakemake.
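The workaround above (create the database under a relative name, then move it to the absolute location) can be sketched in Python as follows. This is only an illustration: `workaround_commands` is a hypothetical helper, not part of PopPUNK or Snakemake, and it only builds the two command lines rather than running them. The flags and values are copied from the commands in this report.

```python
import os
import shlex


def workaround_commands(abs_output, r_files):
    """Build the 'create with a relative name, then move' command pair.

    Hypothetical helper for illustration; it does not run anything.
    """
    # Final path component, e.g. "s_genus_poppunk_db"
    name = os.path.basename(os.path.normpath(abs_output))
    create = [
        "poppunk", "--create-db",
        "--output", name,          # relative name avoids the error
        "--r-files", r_files,
        "--plot-fit", "10",
        "--threads", "3",
        "--min-k", "14",
        "--max-k", "29",
    ]
    # Move the finished database to its intended absolute location
    move = ["mv", "-v", name, abs_output]
    return create, move


abs_output = "/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db"
r_files = "/home/hugo/projects/reparoma/data/rlist.txt"
create, move = workaround_commands(abs_output, r_files)
print(shlex.join(create) + " && " + shlex.join(move))
```

Note that the relative database is created in the current working directory, so the command should run from a directory where a temporary `s_genus_poppunk_db` folder is acceptable.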

poppunk_env.yaml.txt

johnlees commented 1 year ago

Thank you for the detailed report.

This is due to an error here: https://github.com/bacpop/PopPUNK/blob/master/PopPUNK/sketchlib.py#L564. It should be ref_db and not dbPrefix. I can fix this in the next release.
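To illustrate why only absolute paths fail, here is a minimal sketch that reproduces the shape of the path in the traceback. It is not the actual PopPUNK code: `buggy_prefix` and `fixed_prefix` are hypothetical functions, and the assumption is simply that the full --output value ends up being used both as the directory and as the file stem.

```python
import os


def buggy_prefix(output, idx):
    # Assumed failure mode: the whole --output value is reused as the
    # file stem inside the output directory.
    return output + "/" + output + "_fit_example_" + str(idx)


def fixed_prefix(output, idx):
    # Use only the last path component as the file stem.
    stem = os.path.basename(os.path.normpath(output))
    return os.path.join(output, stem + "_fit_example_" + str(idx))


abs_out = "/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db"
rel_out = "s_genus_poppunk_db"

# Absolute path: the directory is followed by a second absolute path,
# which does not exist, matching the FileNotFoundError in the report.
print(buggy_prefix(abs_out, 1) + ".pdf")

# Relative path: the same concatenation happens to give a valid location.
print(buggy_prefix(rel_out, 1) + ".pdf")

# Taking the basename first gives a valid location in both cases.
print(fixed_prefix(abs_out, 1) + ".pdf")
```

With a relative name the directory and the stem are the same short string, so the concatenation is harmless, which is consistent with the tutorial-style command working.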

If you are able, I would appreciate it if you could re-run the version with absolute paths but omitting the --plot-fit argument, just to double-check this is the only issue.

bioinformagica commented 1 year ago

Hello, thank you for the quick reply!

> I can fix this in the next release.

Thanks!

> If you are able, I would appreciate it if you could re-run the version with absolute paths but omitting the --plot-fit argument, just to double-check this is the only issue.

Yes, the problem was indeed with the --plot-fit argument. Without it, the run completes even with ABSPATH:


$ poppunk --create-db --output /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db --r-files /home/hugo/projects/reparoma/data/rlist.txt --threads 3 --min-k 14 --max-k 29
Activating conda environment: .snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
    (with backend: sketchlib v2.0.0
     sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 3 threads
Mode: Building new database from input sequences
Sketching 10 genomes using 3 thread(s)
Progress (CPU): 10 / 10
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 3 thread(s)
Progress (CPU): 100.0%

Done