Hi, a quick question: does meteor-file-collection have built-in data deduplication based on hash comparisons, or in other words, is it a content-addressable storage? Did you consider choosing a differen…
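To make the question concrete, here is a minimal sketch of what I mean by content-addressable storage with hash-based deduplication. It is a toy in-memory model, not meteor-file-collection's API: the class `ContentAddressableStore` and its methods are hypothetical names invented for illustration, and SHA-256 is just an assumed choice of hash.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory store that keys every blob by the SHA-256 of its content.

    Hypothetical illustration only; not the API of any real library.
    """

    def __init__(self):
        self._blobs = {}  # digest -> bytes (each unique content stored once)
        self._names = {}  # filename -> digest

    def put(self, name: str, data: bytes) -> str:
        """Store data under name and return its content digest."""
        digest = hashlib.sha256(data).hexdigest()
        # Only keep the bytes if this content has not been seen before.
        self._blobs.setdefault(digest, data)
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Return the bytes previously stored under name."""
        return self._blobs[self._names[name]]

    def unique_blob_count(self) -> int:
        """Number of distinct contents actually held."""
        return len(self._blobs)


store = ContentAddressableStore()
first = store.put("a.txt", b"hello")
second = store.put("b.txt", b"hello")   # same content, different name
third = store.put("c.txt", b"world")
print(first == second, store.unique_blob_count())
```

Two files with identical bytes map to the same digest, so the content is kept only once while both names stay resolvable.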
Currently, deduplication in the visualization workflow starts _after_ the input data has been staged and tiled. If deduplication is set to occur at any step in the workflow (staging, rasterization, an…
- We have scraped creative writing data consisting of letters, articles, poems, announcements and memos.
- We need to create separate Hugging Face datasets for each tag.
For Tibetan poe…
The version of HKCanCor published on [HuggingFace](https://huggingface.co/datasets/nanyang-technological-university-singapore/hkcancor/tree/main) by NTU is different from the version offered by this l…
Combine erasure coding with data deduplication to simultaneously reduce the overall redundancy in data while increasing the redundancy of unique data. Deduplication also requires less network transfer…
To avoid the serialization cost when accessing data stored in the `PartitionStore`, we could add an LRU cache for frequently accessed data. Prime candidates could be the deduplication table and the `I…
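A rough sketch of the idea, written in Python for brevity rather than in the project's own language. `SerializedStore` is a hypothetical stand-in for a store that keeps values as serialized bytes (it is not the real `PartitionStore`), and `LRUCachedStore` is an assumed wrapper name. The cache holds already-deserialized values and evicts the least recently used entry once the capacity is exceeded.

```python
import pickle
from collections import OrderedDict


class SerializedStore:
    """Hypothetical stand-in for a store that holds values as serialized bytes."""

    def __init__(self):
        self._raw = {}
        self.deserialize_calls = 0  # counts how often we pay the decoding cost

    def put(self, key, value):
        self._raw[key] = pickle.dumps(value)

    def get(self, key):
        self.deserialize_calls += 1
        return pickle.loads(self._raw[key])


class LRUCachedStore:
    """Wraps a store with a bounded LRU cache of deserialized values."""

    def __init__(self, store, capacity):
        self._store = store
        self._capacity = capacity
        self._cache = OrderedDict()  # key -> value, oldest entry first

    def get(self, key):
        if key in self._cache:
            # Cache hit: mark as most recently used, skip deserialization.
            self._cache.move_to_end(key)
            return self._cache[key]
        value = self._store.get(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            # Evict the least recently used entry.
            self._cache.popitem(last=False)
        return value

    def put(self, key, value):
        # Write through to the store; refresh the cached copy if present.
        self._store.put(key, value)
        if key in self._cache:
            self._cache[key] = value
            self._cache.move_to_end(key)


backing = SerializedStore()
for name in ("a", "b", "c"):
    backing.put(name, {"name": name})

cached = LRUCachedStore(backing, capacity=2)
cached.get("a")
cached.get("a")  # served from the cache
print(backing.deserialize_calls)
```

Write-through on `put` is one possible way to keep the cache coherent; whether that or invalidation fits better depends on how often the cached tables are mutated.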
The gen_blockchain_data deploys many contracts. It now contains many duplicated calls to deploy, get_block_number, blocks.insert, and deployments.insert.
It would be great to extract functions to prevent code du…
# Issue
The current implementation of budget validation needs refactoring. On large sites, Purchase Orders with 100 items time out while submitting.
# Bottlenecks
## Unavoidable nested-loop
We want to collect all the open-source Tibetan word-segmented data and save it in a standard format.
The format should be:
'source': 'བོད་ཀྱི་གླུ་གར་རོལ་དབྱངས་ལ་གཞི་རྩའི་ཐོག་ནས་དབྱ…
For finite element codes (ultimately based on T8code.jl) we would like to try incorporating t8code.
Naturally the results would be point/vertex based and not element based in most cases.
The solu…