TileDB-Inc / TileDB-VCF

Efficient variant-call data storage and retrieval library using the TileDB storage library.
https://tiledb-inc.github.io/TileDB-VCF/
MIT License

Cannot load BCF files from tutorial #743

Closed (bruno-ariano closed this issue 5 days ago)

bruno-ariano commented 2 weeks ago

When following the tutorial, at the step ds.ingest_samples(sample_uris=batch1_uris) I get this error:

RuntimeError: TileDB-VCF exception: Cannot open VCF file; failed to load BCF index.

leipzig commented 2 weeks ago

Hi @bruno-ariano, I can't replicate the error (I assume you mean this tutorial?).

Can you run print(batch1_uris), just to make sure we are on the same page?

bruno-ariano commented 1 week ago

Hi @leipzig, below is the screenshot from the script I have run following the tutorial.

[Image: screenshot of the script]

print(batch1_uris) gives me the following output:

['s3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00096.bcf', 's3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00097.bcf', 's3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00099.bcf', 's3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00100.bcf', 's3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00101.bcf']
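Since the error message mentions the BCF index, it can help to know which index objects are expected next to each sample file. The sketch below lists them for URIs like the ones above, assuming the usual htslib convention that a `.bcf` file has a `.csi` index beside it and a `.vcf.gz` file has a `.csi` or `.tbi` index. The helper name `candidate_index_uris` is hypothetical and not part of TileDB-VCF. It only builds the names and does not check that the objects exist or are reachable.

```python
from typing import List


def candidate_index_uris(sample_uri: str) -> List[str]:
    """Return the index locations conventionally expected beside a VCF/BCF file.

    Hypothetical helper: it only derives names, it does not contact S3.
    """
    if sample_uri.endswith(".bcf"):
        # BCF files are conventionally indexed with CSI.
        return [sample_uri + ".csi"]
    if sample_uri.endswith(".vcf.gz"):
        # Compressed VCF files may carry either a CSI or a tabix index.
        return [sample_uri + ".csi", sample_uri + ".tbi"]
    raise ValueError(f"unrecognised variant file extension: {sample_uri}")


# Two of the URIs printed in the thread.
batch1_uris = [
    "s3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00096.bcf",
    "s3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kgp3-chr1/HG00097.bcf",
]

for uri in batch1_uris:
    print(uri, "->", candidate_index_uris(uri))
```

If the index names look right but loading still fails, the cause is more likely access or region configuration than a missing file.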

leipzig commented 1 week ago

Hi @bruno-ariano, I still have not been able to replicate this (even on a GCP VM that has never seen AWS). If you can add the following arguments to this command, it might help.

ds = tiledbvcf.Dataset(uri=array_uri, mode="w", tiledb_config={"vfs.s3.no_sign_request": True, "vfs.s3.region": "us-east-1"})

If that doesn't work, we are happy to provide you with some free credit on TileDB Cloud, where we have that tutorial and others ready to run.
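For anyone reusing this suggestion, here is a minimal sketch that builds the configuration separately so it can be inspected before the dataset is opened. The helper names `anonymous_s3_config` and `open_dataset_for_write` are hypothetical. The key spelling `vfs.s3.no_sign_request` and the string value `"true"` follow the TileDB configuration reference as I understand it; treat both as assumptions and check them against your installed version.

```python
from typing import Dict


def anonymous_s3_config(region: str = "us-east-1") -> Dict[str, str]:
    """Build TileDB settings for unsigned reads from a public S3 bucket.

    Hypothetical helper; the parameter spellings are assumptions to verify
    against the TileDB configuration reference.
    """
    return {
        # Skip request signing, so local AWS credentials are not used.
        "vfs.s3.no_sign_request": "true",
        # Pin the region instead of relying on the local AWS setup.
        "vfs.s3.region": region,
    }


def open_dataset_for_write(array_uri: str, region: str = "us-east-1"):
    """Open a TileDB-VCF dataset in write mode with the settings above."""
    import tiledbvcf  # imported lazily so the config helper works without it

    return tiledbvcf.Dataset(
        uri=array_uri,
        mode="w",
        tiledb_config=anonymous_s3_config(region),
    )


print(anonymous_s3_config())
```

Calling `ds = open_dataset_for_write(array_uri)` and then `ds.ingest_samples(sample_uris=batch1_uris)` would mirror the tutorial step, with the region fixed explicitly.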

bruno-ariano commented 5 days ago

Thanks @leipzig, I think it was a problem with the region I had configured in my AWS settings.
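A quick way to spot this kind of mismatch is to compare the region set in the local environment with the one suggested in the thread (us-east-1). The sketch below reads only the standard AWS environment variables `AWS_REGION` and `AWS_DEFAULT_REGION`; it does not read `~/.aws/config`. Whether TileDB actually picks up these values in a given setup is an assumption. The helper names are hypothetical.

```python
import os
from typing import Mapping, Optional

# Standard AWS environment variables that can carry a region.
REGION_ENV_VARS = ("AWS_REGION", "AWS_DEFAULT_REGION")


def ambient_region(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty region found in the environment, if any."""
    for name in REGION_ENV_VARS:
        value = env.get(name)
        if value:
            return value
    return None


def region_differs(expected: str = "us-east-1",
                   env: Mapping[str, str] = os.environ) -> bool:
    """True when the environment sets a region other than the expected one."""
    found = ambient_region(env)
    return found is not None and found != expected


if region_differs():
    print("Local AWS region is", ambient_region(),
          "but the demo bucket is expected in us-east-1")
```

Setting "vfs.s3.region" explicitly in tiledb_config, as suggested earlier in the thread, avoids depending on the ambient value.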