spacegraphcats / spacegraphcats

Indexing & querying large assembly graphs -- in space, no one can hear you miao!
https://spacegraphcats.github.io/spacegraphcats/

ProtectedOutputException: Write-protected output files for rule bcalm_catlas_prepare_input: GCF_000765235.1_k31/bcalm.unitigs.db #445

Open taylorreiter opened 2 years ago

taylorreiter commented 2 years ago

ProtectedOutputException in line 199 of /home/tereiter/github/2020-ibd/.snakemake/conda/7218b88cfb8b4e4e84f86d335f04be9e/lib/python3.8/site-packages/spacegraphcats/conf/Snakefile:
Write-protected output files for rule bcalm_catlas_prepare_input:
GCF_000765235.1_k31/bcalm.unitigs.db
  File "/home/tereiter/github/2020-ibd/.snakemake/conda/7218b88cfb8b4e4e84f86d335f04be9e/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 136, in run_jobs
  File "/home/tereiter/github/2020-ibd/.snakemake/conda/7218b88cfb8b4e4e84f86d335f04be9e/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 441, in run
  File "/home/tereiter/github/2020-ibd/.snakemake/conda/7218b88cfb8b4e4e84f86d335f04be9e/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 230, in _run
  File "/home/tereiter/github/2020-ibd/.snakemake/conda/7218b88cfb8b4e4e84f86d335f04be9e/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 155, in _run

This is spacegraphcats 2.1.dev6+g4f23e9a.

ctb commented 2 years ago

what was the rule and/or command line?

thanks :)

taylorreiter commented 2 years ago

rule bcalm_catlas_prepare_input:

Invoked with

python -m spacegraphcats build {input.conf} --outdir={params.outdir} --rerun-incomplete --nolock

With conf file

catlas_base: GCF_000765235.1
input_sequences:
- outputs/sgc_genome_queries_hardtrim/GCF_000765235.1.hardtrim.fa.gz
radius: 10
paired_reads: true

ctb commented 2 years ago

ok. I think the problem is that the database should be deleted before that rule is run.

ctb commented 2 years ago

I was able to replicate with:

python -m spacegraphcats build dory-test --outdir=dory-foo --rerun-incomplete --nolock
touch dory-foo/dory_k21/bcalm.inputlist.txt
python -m spacegraphcats build dory-test --outdir=dory-foo --rerun-incomplete --nolock

I'm 99% certain that what's going on is that the output db = f"{cdbg_dir}/bcalm.unitigs.db" already exists and is explicitly marked as protected, so snakemake isn't overwriting it.
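
If you want to confirm this before touching anything, a rough check along these lines might help. It is only a sketch: it assumes that snakemake's protected() leaves the file with no write permission bits set, and the helper name is_write_protected is made up here (it is not part of spacegraphcats or snakemake). The path comes from the reproduction above.

```python
import os
import stat


def is_write_protected(path):
    """Return True if `path` exists and has no write permission bits set.

    Assumption: a file that snakemake marked with protected() is read-only
    for user, group and other.
    """
    if not os.path.exists(path):
        return False
    mode = os.stat(path).st_mode
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    return (mode & write_bits) == 0


if __name__ == "__main__":
    # Path from the dory-test reproduction; adjust to your own --outdir.
    db = "dory-foo/dory_k21/bcalm.unitigs.db"
    print(db, "write-protected:", is_write_protected(db))
```

If this reports True for a bcalm.unitigs.db that the rule is about to regenerate, that would be consistent with the exception above.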

The short-term fix is to remove that file.
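
For anyone who would rather script the removal than do it by hand, here is a minimal sketch. The function name remove_protected_output is hypothetical and not something sgc provides. It deliberately bypasses the write protection, so only point it at a database you are sure you want rebuilt.

```python
import os
import stat


def remove_protected_output(path):
    """Delete a write-protected output file so the rule can be rerun.

    Returns True if a file was removed, False if nothing was there.
    """
    if not os.path.exists(path):
        return False
    # Restore the owner's write bit first. On POSIX this is not strictly
    # needed to unlink a file, but some platforms refuse to delete
    # read-only files.
    os.chmod(path, os.stat(path).st_mode | stat.S_IWUSR)
    os.remove(path)
    return True


if __name__ == "__main__":
    # Path from the dory-test reproduction; adjust to your own --outdir.
    removed = remove_protected_output("dory-foo/dory_k21/bcalm.unitigs.db")
    print("removed" if removed else "nothing to remove")
```

After that, rerunning the same `python -m spacegraphcats build ...` command should let the rule write a fresh database.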

The long-term fix might be to remove protected status, but... I'm resistant, because I think I put this in place to avoid Very Large Files being overwritten too easily by sgc. So it might be better to have this be a hiccough that you need to manually overcome.

taylorreiter commented 2 years ago

Yes plz don't remove write protections here. I think we should leave this issue open so a solution is searchable. Indeed, I deleted the file and restarted spacegraphcats and it ran perfectly.