TorchDSP / torchsig

TorchSig is an open-source signal processing machine learning toolkit based on the PyTorch data handling pipeline.
MIT License

350gb for sig53 dataset? #154

Closed by maker-slim 1 year ago

maker-slim commented 1 year ago

After running generate_sig53.py with the download path specified, I found that the Sig53 dataset seems to need about 350 GB of space. I noticed this because I had modified the LMDB environment in writer.py:

    def __init__(self, path: str, *args, **kwargs):
        super(LMDBDatasetWriter, self).__init__(*args, **kwargs)
        self.path = path
        self.env = lmdb.Environment(path="E:\code\glaucus-1.1.3\data\sig53_clean_train", subdir=True, map_size=int(1e11), max_dbs=2)
        self.data_db = self.env.open_db(b"data")
        self.label_db = self.env.open_db(b"label")

With map_size=int(1e11), the download stopped at 24%. I then changed it to 2e11 and it reached 47%, at which point about 175 GB had been written. I did not expect this dataset to be so large. Is this normal, and where can I change the limit?

    24%|██▎ | 41809/176666 [07:24<23:54, 94.02it/s]
    Traceback (most recent call last):
      File "E:\code\torchsig-0.4.0\scripts\generate_sig53.py", line 64, in <module>
        main()
      File "D:\anaconda3\envs\try\lib\site-packages\click\core.py", line 1157, in __call__
        return self.main(*args, **kwargs)
      File "D:\anaconda3\envs\try\lib\site-packages\click\core.py", line 1078, in main
        rv = self.invoke(ctx)
      File "D:\anaconda3\envs\try\lib\site-packages\click\core.py", line 1434, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "D:\anaconda3\envs\try\lib\site-packages\click\core.py", line 783, in invoke
        return __callback(*args, **kwargs)
      File "E:\code\torchsig-0.4.0\scripts\generate_sig53.py", line 53, in main
        generate(root, configs)
      File "E:\code\torchsig-0.4.0\scripts\generate_sig53.py", line 31, in generate
        creator.create()
      File "E:\code\torchsig-0.4.0\torchsig\utils\writer.py", line 159, in create
        self.writer.write(batch)
      File "E:\code\torchsig-0.4.0\torchsig\utils\writer.py", line 119, in write
        txn.put(
    lmdb.MapFullError: mdb_put: MDB_MAP_FULL: Environment mapsize limit reached

Can the author give me a clue? I would really appreciate any help!
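As a rough sanity check on the numbers in this report, the final size of the database being written can be extrapolated linearly from the progress bar. This is only a sketch: the helper name `projected_total_bytes` is hypothetical, and it assumes every sample takes about the same space and that the map was essentially full (about 1e11 bytes) when the error was raised.

```python
def projected_total_bytes(bytes_written, items_done, items_total):
    """Linearly extrapolate the final size from the progress made so far."""
    if items_done <= 0:
        raise ValueError("items_done must be positive")
    return bytes_written * items_total / items_done


# Figures from the progress bar in this report: the 1e11-byte map
# filled after 41809 of 176666 items had been written.
estimate = projected_total_bytes(1e11, 41809, 176666)
print(f"projected size: {estimate / 1e9:.0f} GB")
```

The estimate covers only the one LMDB environment that was being written in that run, so it is a lower bound on what the whole generation script would need.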

gvanhoy commented 1 year ago

Sig53 is around 600 GB. Do you have the disk space?
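Before raising `map_size`, it may help to confirm that the target drive can hold the dataset at all. The sketch below uses only the standard library. The helper names `has_room` and `pick_map_size` are hypothetical, the 600 GB figure is taken from the reply above and read as decimal gigabytes, and the 20% headroom is an arbitrary assumption.

```python
import shutil

GB = 10**9  # assumption: "600G" is read as decimal gigabytes


def has_room(path, required_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


def pick_map_size(required_bytes, headroom_percent=20):
    """Pad the expected size so the LMDB map does not fill near the end."""
    return required_bytes * (100 + headroom_percent) // 100


required = 600 * GB
print("enough free space:", has_room(".", required))
# The result would be passed as map_size=... to lmdb.Environment in writer.py.
print("map_size to try:", pick_map_size(required))
```

If the check fails, a larger `map_size` will not help on its own, because the drive would fill before the map does.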