Open taiya opened 4 years ago
cc @Conchylicultor
More info here would be helpful.
Is it that you'd like for the filepath to be different? What do you mean by the "self-contained" style? What's the "mock run"?
In the main TFDS repo, checksums are stored in tensorflow_datasets/checksums, while here we are storing them in the dataset folder, e.g. tensorflow_graphics/datasets/modelnet40/checksums.
That is, in TFG, all of the files for a dataset are contained within its folder (and subfolders). I worked with Etienne to get this workflow up and running a few weeks ago.
@jackd has now also added tfds.testing.mock_data in #310, but using it requires:
1) the creation of a dataset_info.json (which is not great for contributors, but "ok" for now)
2) the dataset_info.json files for all datasets to live within a single folder; in our case we are using tensorflow_graphics/datasets/testing/metadata
Ok, so sounds like you and @Conchylicultor have discussed this previously; what was the conclusion of the discussion, or work that was considered/started? (If there's a related GitHub issue, please link; also Etienne can chime in here).
No, only discussed with Etienne, not @Conchylicultor.
There are two separate things that would be nice:
1) give the user the ability to choose where the metadata is stored (similar to tfds.download.add_checksums_dir(_CHECKSUM_DIR), I guess)
2) allow a mock_data workflow where the data is programmatically defined within the test file (so as not to depend on deployed data)
Feel free to reach out on GVC if you want to discuss in detail ;)
For info, @Conchylicultor == Etienne.
I agree about those two points. I tried to answer by mail with more context. I can try to look at 1 by the end of the week/early next week.
Currently, storing dataset info in tensorflow_graphics/datasets/testing/metadata breaks the "self-contained" style we are following for TFDS datasets stored within TFG. We should modify the logic of the mock run so that tensorflow_graphics/datasets/testing/metadata/model_net40/1.0.0/dataset_info.json can be stored in something like tensorflow_graphics/datasets/model_net40/dataset_info.json.
CC'ing @rsepassi so he can provide some pointers on how to fix this?
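As a rough illustration of the relocation being asked for, the sketch below maps the current central metadata path to the per-dataset path. The helper name self_contained_info_path is hypothetical, and dropping the version component is an assumption based only on the two example paths in this post; it is not how TFDS resolves dataset_info.json today.

```python
import pathlib


def self_contained_info_path(central_path,
                             datasets_root="tensorflow_graphics/datasets"):
    """Maps .../<name>/<version>/dataset_info.json to <datasets_root>/<name>/dataset_info.json.

    Hypothetical helper: the target layout follows the example in this issue,
    which has no version subfolder.
    """
    parts = pathlib.PurePosixPath(central_path).parts
    # The last three components are expected to be <name>/<version>/dataset_info.json.
    if len(parts) < 3 or parts[-1] != "dataset_info.json":
        raise ValueError(f"Unexpected metadata path: {central_path}")
    dataset_name = parts[-3]
    return pathlib.PurePosixPath(datasets_root) / dataset_name / "dataset_info.json"


central = ("tensorflow_graphics/datasets/testing/metadata/"
           "model_net40/1.0.0/dataset_info.json")
relocated = str(self_contained_info_path(central))
```

If several versions of a dataset ever need separate metadata, the version component would have to be kept in the target path instead of being dropped.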