lucien-roach / a-dda

Automatically exported from code.google.com/p/a-dda

storing dipoles as binary file #90

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What new or enhanced feature are you proposing?

For very large targets it is not practical to store
the dipoles and/or the internal field as text.
A binary format (preferably a machine-independent one, such as netCDF) is required.

What goal would this enhancement help you achieve?

Near-field calculations for large targets

Original issue reported on code.google.com by fabio.de...@gmail.com on 23 Nov 2009 at 2:01

GoogleCodeExporter commented 8 years ago

Original comment by yurkin on 23 Nov 2009 at 2:53

GoogleCodeExporter commented 8 years ago

Original comment by yurkin on 23 Aug 2010 at 4:57

GoogleCodeExporter commented 8 years ago

Original comment by yurkin on 22 Apr 2011 at 2:40

GoogleCodeExporter commented 8 years ago

Original comment by yurkin on 4 Feb 2013 at 5:36

GoogleCodeExporter commented 8 years ago
issue 183 is a generalization of this issue.

Original comment by yurkin on 23 Nov 2013 at 8:21

GoogleCodeExporter commented 8 years ago
I did a little test and it seems that using a binary format doesn't actually 
reduce the size of the input files that much. I experimented with a dipole file 
of about 408000 dipoles with positions ranging from -131 to 158. The file sizes 
were as follows:

Original file: 4.2 MB
Stored in binary as 16-bit ints: 2.4 MB
Original file gzipped: 0.92 MB
Binary file gzipped: 0.95 MB

So although the text file is about twice the size of the binary, it compresses 
much better. Therefore one option for reducing file sizes would be to implement 
reading of the current file formats in gzipped form. An even simpler 
alternative for reducing input file size would be to implement reading the 
dipole file from standard input, so one could pipe dipole data to ADDA through 
gunzip.
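
A comparison of this kind can be reproduced with a short script. The following is only a sketch: it assumes a text shape file with one `x y z` integer triple per line and uses a synthetic filled ball rather than the original dipole file, so the resulting sizes will not match the figures above. The helper names (`ball`, `as_text`, `as_int16`, `sizes`) are made up for illustration.

```python
import gzip
import struct

def ball(radius):
    """Integer grid points inside a ball, in the order a grid writer would emit them."""
    r2 = radius * radius
    rng = range(-radius, radius + 1)
    return [(x, y, z) for z in rng for y in rng for x in rng
            if x * x + y * y + z * z <= r2]

def as_text(dipoles):
    # Assumed text layout: "x y z" per line, ASCII.
    return "".join(f"{x} {y} {z}\n" for x, y, z in dipoles).encode("ascii")

def as_int16(dipoles):
    # Little-endian signed 16-bit integers, three per dipole.
    flat = [c for d in dipoles for c in d]
    return struct.pack(f"<{len(flat)}h", *flat)

def sizes(dipoles):
    """Byte counts for the text and binary encodings, raw and gzipped."""
    text, binary = as_text(dipoles), as_int16(dipoles)
    return {
        "text": len(text),
        "binary": len(binary),
        "text_gz": len(gzip.compress(text)),
        "binary_gz": len(gzip.compress(binary)),
    }

if __name__ == "__main__":
    for name, size in sizes(ball(40)).items():
        print(f"{name:>10}: {size} bytes")
```

How well each encoding compresses depends strongly on the ordering and regularity of the dipoles, so it is worth running this on a real shape file before drawing conclusions.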

Original comment by jsleino...@gmail.com on 4 Dec 2013 at 2:21

GoogleCodeExporter commented 8 years ago
Passing the stdout of gunzip off as a shape (pseudo)file can already be done on Unix, like this:
./adda ... -shape read <( gunzip ... )
However, it fails because ADDA scans shape files twice, and on the second scan 
the piped pseudofile turns out to be empty. The file is read this way for 
robustness and, in some sense, performance. There is also some random access to 
determine the format of the file automatically. So the only way I see to read 
stdin (or a piped stream) is to buffer the stream in ADDA. But that seems to 
remove most of the benefits.
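
The failure mode can be illustrated outside ADDA. This is a minimal sketch, not ADDA's actual reading code: `two_pass_scan` is a hypothetical stand-in for any reader that makes two passes over its input, and the pipe stands in for the `<( gunzip ... )` pseudofile.

```python
import io
import os

def two_pass_scan(stream):
    """Pass 1 counts lines; pass 2 tries to rewind and read them again."""
    count = sum(1 for _ in stream)
    try:
        stream.seek(0)
    except OSError:
        pass  # pipes and FIFOs cannot be rewound
    lines = list(stream)
    return count, lines

def scan_pipe(data):
    """Feed the data through an OS pipe, as a process substitution would."""
    r, w = os.pipe()
    os.write(w, data)  # small payload, fits in the pipe buffer
    os.close(w)
    with os.fdopen(r, "rb") as stream:
        return two_pass_scan(stream)

def scan_buffered(data):
    """Same scan, but over an in-memory buffer that supports seeking."""
    return two_pass_scan(io.BytesIO(data))

if __name__ == "__main__":
    data = b"0 0 0\n1 0 0\n0 1 0\n"
    print(scan_pipe(data))      # second pass finds nothing
    print(scan_buffered(data))  # second pass sees all lines
```

Buffering makes the second pass work, but only at the cost of holding the whole stream in memory, which is the drawback noted above.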

The second issue is that the problem is mostly relevant for large MPI runs, where 
shape files can be tens of GB but each process takes only a small part of the file 
(and still cannot buffer the whole file). For smaller shape files, using a 
temporary file (instead of a FIFO) seems a fine solution.
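
For the smaller-file case, the temporary-file route might look like the sketch below. The helper name `gunzip_to_tempfile` and the `.dat` suffix are illustrative assumptions; the resulting path would then be given to ADDA in place of the process substitution, i.e. as the argument of `-shape read`.

```python
import gzip
import shutil
import tempfile

def gunzip_to_tempfile(gz_path):
    """Decompress gz_path into a regular temporary file and return its path.

    A regular file can be rewound and read as many times as needed,
    unlike a pipe. The caller is responsible for deleting it afterwards.
    """
    with gzip.open(gz_path, "rb") as src, \
            tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, no full in-memory copy
        return dst.name
```

This costs disk space equal to the uncompressed file for the duration of the run, so it does not help with the multi-GB MPI case.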

Finally, there is another idea described in issue 31. It is probably not that 
efficient for very sparse particles (still worth trying), but can lead to 
several orders of magnitude compression for large homogeneous and relatively 
compact particles (which have the largest size among computationally feasible 
runs).

Original comment by yurkin on 4 Dec 2013 at 3:42