qiime2 / qiime2

Official repository for the QIIME 2 framework.
https://qiime2.org
BSD 3-Clause "New" or "Revised" License

Artifact.data should be lazy #10

Closed jairideout closed 8 years ago

jairideout commented 8 years ago

Currently, when an Artifact is instantiated from a tar file, its data is loaded into memory immediately and stored on the data property. The data property should be lazy, so that the data is loaded only when .data is first accessed.

Question: should this always load a new instance of data, or cache the result? This depends on how Artifact is expected to be interacted with, which is unclear to me right now.
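To make the two options in the question concrete, here is a minimal sketch of a lazy data property. It assumes a hypothetical `LazyArtifact` class and a `make_tar` helper, both invented for illustration; neither is the QIIME 2 implementation. `data` loads on first access and caches the result, while `fresh_data()` reloads on every call.

```python
import io
import tarfile
from functools import cached_property


class LazyArtifact:
    """Hypothetical stand-in for Artifact; not the QIIME 2 implementation."""

    def __init__(self, tar_bytes, member_name):
        # Only remember where the data lives; nothing is read yet.
        self._tar_bytes = tar_bytes
        self._member_name = member_name
        self.load_count = 0  # instrumentation for this example only

    def _load(self):
        # Read the named member out of the in-memory tar archive.
        self.load_count += 1
        with tarfile.open(fileobj=io.BytesIO(self._tar_bytes)) as tar:
            return tar.extractfile(self._member_name).read()

    @cached_property
    def data(self):
        # Option A: load on first access, then cache on the instance.
        return self._load()

    def fresh_data(self):
        # Option B: reload on every call, so callers get independent objects.
        return self._load()


def make_tar(name, payload):
    """Build a small tar archive in memory (hypothetical test helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()
```

Caching avoids repeated I/O but means every caller shares one object, so any mutation is visible to all of them. Reloading costs more but keeps callers isolated. Which is preferable depends on how Artifact is expected to be used.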

antgonza commented 8 years ago

Currently in Qiita an artifact is simply a file location or entry in the DB with a format type. For example, an artifact can be raw fastq files (raw_forward/reverse/barcode_fastq) or a biom. However, you can only apply split_libraries to raw formats and beta diversity to bioms.

ebolyen commented 8 years ago

@antgonza, our artifacts aren't too different, but this is regarding the actual object and its API.

antgonza commented 8 years ago

OK, thanks. Then I think everything, wherever possible, should be lazy. I'm concerned about known issues like trying to load all sequences.

wasade commented 8 years ago

Agree that lazy is ideal.

jairideout commented 8 years ago

Artifact is now lazy and views of the data can be created using Artifact.view.