Closed nsheff closed 5 years ago
Related issue:
refgenie getseq -g hg38 -l chr1:5-10
Traceback (most recent call last):
File "/home/nsheff/.local/lib/python3.5/site-packages/pyfaidx/__init__.py", line 359, in __init__
if mutable else 'rb')
IsADirectoryError: [Errno 21] Is a directory: '/ext/yeti/refgenomes/hg38/fasta/default'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/nsheff/.local/bin/refgenie", line 10, in <module>
sys.exit(main())
File "/home/nsheff/.local/lib/python3.5/site-packages/refgenie/refgenie.py", line 659, in main
refgenie_getseq(rgc, args.genome, args.locus)
File "/home/nsheff/.local/lib/python3.5/site-packages/refgenie/refgenie.py", line 506, in refgenie_getseq
fa = pyfaidx.Fasta(rgc.get_asset(genome, "fasta"))
File "/home/nsheff/.local/lib/python3.5/site-packages/pyfaidx/__init__.py", line 996, in __init__
build_index=build_index)
File "/home/nsheff/.local/lib/python3.5/site-packages/pyfaidx/__init__.py", line 368, in __init__
"Cannot read FASTA file %s" % filename)
pyfaidx.FastaNotFoundError: Cannot read FASTA file /ext/yeti/refgenomes/hg38/fasta/default
getseq appears to be seeking fasta instead of fasta.fasta.
I think we should make fasta work as it used to.
We might do that.
It is a question of what behavior a user expects. I explicitly implemented it this way.
My reasoning was: if my asset is a directory (we decided to point to it with .), I can refer to it with no seek_key; otherwise (if it is a set of files, each with its own seek_key defined), I have to specify explicitly which one I'm referring to.
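A minimal sketch of the behavior described here, which is also what the traceback shows (asking for fasta with no seek key returns the asset directory). This is not refgenie's actual code: resolve_explicit, its arguments, and the file names in the example are hypothetical.

```python
import posixpath


def resolve_explicit(asset_dir, seek_keys, seek_key=None):
    """Hypothetical resolver: no seek key always means the asset directory."""
    if seek_key is None:
        # No default is picked, even when seek keys are defined.
        return asset_dir
    if seek_key not in seek_keys:
        raise KeyError("undefined seek key: %s" % seek_key)
    return posixpath.join(asset_dir, seek_keys[seek_key])


# Illustrative layout; the file names are made up for this example.
fasta_keys = {"fasta": "hg38.fa", "fai": "hg38.fa.fai"}
asset_dir = "/ext/yeti/refgenomes/hg38/fasta/default"

print(resolve_explicit(asset_dir, fasta_keys))           # the directory
print(resolve_explicit(asset_dir, fasta_keys, "fasta"))  # the FASTA file
```

Under this scheme, a caller that wants the FASTA file has to pass the seek key explicitly.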
I don't see a disadvantage to this rule:
if the asset has seek keys, and a seek key is defined with the same name as the asset, then that is the default seek key returned if no key is provided.
You could still accomplish what you are proposing by putting in a self-named seek key with a pointer to the folder.
But with your method you cannot do what we want in most cases, which is to point to a file without a seek key even when seek keys are defined (fasta is a good example).
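The proposed rule could be sketched roughly as follows. This is an illustration under assumptions, not the code that went into refgenie: resolve_with_default, the dict-of-seek-keys layout, and the file names are all hypothetical.

```python
import posixpath


def resolve_with_default(asset_name, asset_dir, seek_keys, seek_key=None):
    """Hypothetical resolver: a seek key named like the asset is the default."""
    if seek_key is None:
        if asset_name not in seek_keys:
            # No self-named key defined: fall back to the asset directory.
            return asset_dir
        seek_key = asset_name
    if seek_key not in seek_keys:
        raise KeyError("undefined seek key: %s" % seek_key)
    # normpath collapses a "." pointer back to the directory itself.
    return posixpath.normpath(posixpath.join(asset_dir, seek_keys[seek_key]))


# Illustrative layouts; the file names are made up for this example.
fasta_keys = {"fasta": "hg38.fa", "fai": "hg38.fa.fai"}
index_keys = {"bowtie2_index": "."}  # self-named key pointing to the folder

print(resolve_with_default("fasta", "/genomes/hg38/fasta/default", fasta_keys))
print(resolve_with_default("bowtie2_index", "/genomes/hg38/bowtie2_index/default", index_keys))
```

With this rule, fasta resolves to the file without repeating the key, and a folder asset keeps working through a self-named key that points to ".".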
that's working now, but shouldn't it list the non-keyed version in the asset list?
Local assets:
hg38/ fasta.chrom_sizes:default, fasta.fai:default, fasta.fasta:default
rCRSd/ bowtie2_index:default, fasta.chrom_sizes:default, fasta.fai:default, fasta.fasta:default
I think it should just say fasta:default instead of fasta.fasta:default.
In other words, fasta.fasta should not be a thing; it should just be fasta.
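One way to get that listing, sketched under assumptions: display_name is a hypothetical helper, not a refgenie function, and it simply drops the seek key from the label when it repeats the asset name.

```python
def display_name(asset, seek_key, tag):
    """Hypothetical label builder for the asset list."""
    if seek_key == asset:
        # Self-named seek key: show the short form, e.g. fasta:default.
        return "%s:%s" % (asset, tag)
    return "%s.%s:%s" % (asset, seek_key, tag)


labels = [display_name("fasta", key, "default")
          for key in ("chrom_sizes", "fai", "fasta")]
print(", ".join(labels))
```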
fixed
[mstolarczyk@MichalsMBP test_genomes]: refgenie list -c genomes.yaml
Local genomes: mouse_chrM2x, rCRSd
Local recipes: fasta, bowtie2_index, bwa_index, hisat2_index, bismark_bt2_index, bismark_bt1_index, kallisto_index, salmon_index, epilog_index, star_index, gencode_gtf, ensembl_gtf, ensembl_rb, refgene_anno, feat_annotation
Local assets:
mouse_chrM2x/ bowtie2_index:default, fasta.chrom_sizes:default, fasta.fai:default, fasta:default
rCRSd/ bowtie2_index:default, fasta.chrom_sizes:test, fasta.fai:test, fasta:test
Right now seeking for the fasta asset doesn't work, because it expects you to type fasta.fasta.
See, the vanilla fasta key is just pointing to the folder.
Do we want to enable a default when there are keys present? I thought if the name of the seek key matched the asset name, then the repetition shouldn't be required?