hirrolot / datatype99

Algebraic data types for C99
MIT License

Lots of memory used during compilation #5

Closed picca closed 3 years ago

picca commented 3 years ago

I work on an old computer with only 4 GB of memory.

I am using just two datatypes:

datatype(
        DetectorType,
        (ImXpadS70, struct imxpad_t),
        (ImXpadS140, struct imxpad_t),
        (XpadFlatCorrected, struct rectangular_t),
        (Eiger1M, struct dectris_t)
        );

datatype(
        detector_t,
        (Detector, const char *, struct shape_t, DetectorType)
        );
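For readers unfamiliar with sum types in C, here is a hand-written tagged union with roughly the shape that a definition like `DetectorType` stands for. It is only a sketch: the fields of the payload structs are hypothetical (the thread never shows them), and the names `DetectorTypeTag`, `make_eiger1m` and `detector_name` are illustrative, not the identifiers Datatype99 actually generates.

```c
/* Hypothetical payload structs: the thread does not show their fields. */
struct imxpad_t      { int chips; };
struct rectangular_t { int width, height; };
struct dectris_t     { int modules; };

/* One tag per variant. */
typedef enum {
    ImXpadS70Tag,
    ImXpadS140Tag,
    XpadFlatCorrectedTag,
    Eiger1MTag
} DetectorTypeTag;

/* The tag says which member of the union is currently valid. */
typedef struct {
    DetectorTypeTag tag;
    union {
        struct imxpad_t      imxpad;      /* ImXpadS70 and ImXpadS140 */
        struct rectangular_t rectangular; /* XpadFlatCorrected */
        struct dectris_t     dectris;     /* Eiger1M */
    } data;
} DetectorType;

/* Illustrative value constructor for one variant. */
static DetectorType make_eiger1m(struct dectris_t d) {
    DetectorType t;
    t.tag = Eiger1MTag;
    t.data.dectris = d;
    return t;
}

/* Dispatch on the tag, which is what pattern matching boils down to. */
static const char *detector_name(const DetectorType *t) {
    switch (t->tag) {
    case ImXpadS70Tag:         return "ImXpadS70";
    case ImXpadS140Tag:        return "ImXpadS140";
    case XpadFlatCorrectedTag: return "XpadFlatCorrected";
    case Eiger1MTag:           return "Eiger1M";
    }
    return "unknown";
}
```

The macro version writes this boilerplate for you, and that work is done by the preprocessor, which is where the compile-time cost discussed in this issue comes from.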

Since my last update, the amount of memory used during compilation has exploded. I am wondering whether the unroll optimisation is the culprit. It was OK with the code from the 14th of February.

Cheers

Fred

hirrolot commented 3 years ago

I've started to use lists instead of plain variadics, this is why compilation got slower. I'm trying to optimise all this machinery now, including the evaluator itself.

hirrolot commented 3 years ago

Well, I've done some optimisation. Is it better now, @picca? (Download the latest Metalang99 and Datatype99 commits.)

picca commented 3 years ago

Instead of 2.5 GB of memory per file, I now use 1.5 GB, so it is an improvement. I would still say that this is quite a lot for only one datatype (I removed the other one).

hirrolot commented 3 years ago

BTW, how do you measure memory consumption?

picca commented 3 years ago

For now, I just look at htop while the file is compiling.

hirrolot commented 3 years ago

Do you use precompiled headers?

picca commented 3 years ago

Here it is with /usr/bin/time and only one file:

    Command being timed: "make"
    User time (seconds): 5.78
    System time (seconds): 0.49
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.26
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 1328556
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 365222
    Voluntary context switches: 235
    Involuntary context switches: 86
    Swaps: 0
    File system inputs: 0
    File system outputs: 6240
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

The interesting part is Maximum resident set size (kbytes): 1328556, so 1.3 GB.

I was close ;)
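The "Maximum resident set size" figure can also be obtained programmatically. Below is a minimal sketch, assuming a POSIX system, that runs a command and reads the peak RSS of the waited-for child through `getrusage(RUSAGE_CHILDREN, ...)`. The function name `peak_rss_of_command` is made up for this example, and the code is not claimed to be how `/usr/bin/time` is implemented.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the command described by argv (NULL-terminated), waits for it, and
 * returns the peak resident set size reported for waited-for children.
 * On Linux the unit of ru_maxrss is kilobytes. Returns -1 on failure. */
static long peak_rss_of_command(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }

    struct rusage usage;
    if (getrusage(RUSAGE_CHILDREN, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}
```

Calling it with an argument vector such as `{"make", NULL}` would give a number comparable to the one above. Note that for `RUSAGE_CHILDREN` the value is the RSS of the largest waited-for descendant, not the sum over the whole process tree.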

picca commented 3 years ago

Previously:

    Command being timed: "make"
    User time (seconds): 9.47
    System time (seconds): 0.80
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.48
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 2312980
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 65
    Minor (reclaiming a frame) page faults: 608795
    Voluntary context switches: 889
    Involuntary context switches: 132
    Swaps: 0
    File system inputs: 83816
    File system outputs: 6272
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

So Maximum resident set size (kbytes): 2312980, i.e. 2.3 GB.

picca commented 3 years ago

now the version I was happy with :))

    Command being timed: "make"
    User time (seconds): 2.70
    System time (seconds): 0.33
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.02
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 537624
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 156510
    Voluntary context switches: 230
    Involuntary context switches: 89
    Swaps: 0
    File system inputs: 0
    File system outputs: 4952
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Only 0.5 GB.

picca commented 3 years ago

I do not use precompiled headers

hirrolot commented 3 years ago

Consider using precompiled headers because they will cache the results of datatype(...). Also, if you're compiling on GCC, consider -ftrack-macro-expansion=0.
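To see why the flag matters, recall that GCC by default records the location of tokens that come out of macro expansions (`-ftrack-macro-expansion` defaults to level 2, and level 0 turns the tracking off). A short macro invocation can produce a very large number of tokens. The toy macros below are only an illustration of that growth; the `REP*` names are invented here and have nothing to do with Metalang99's actual machinery, which is far more elaborate.

```c
/* Each level squares the repetition count of the previous one. */
#define REP2(e)   e e
#define REP4(e)   REP2(REP2(e))
#define REP16(e)  REP4(REP4(e))
#define REP256(e) REP16(REP16(e))

/* A single short invocation expands to 256 statements. */
static int count_expanded_statements(void) {
    int n = 0;
    REP256(n++;)
    return n;
}
```

One more level in the same style would yield 65536 copies from an equally short invocation, so the bookkeeping per expanded token adds up quickly in macro-heavy code.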

picca commented 3 years ago

With -ftrack-macro-expansion=0, the memory used is reduced a lot (this was with the favorable version):

    Command being timed: "make"
    User time (seconds): 2.67
    System time (seconds): 0.14
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.80
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 92668
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 50882
    Voluntary context switches: 229
    Involuntary context switches: 72
    Swaps: 0
    File system inputs: 0
    File system outputs: 5048
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

picca commented 3 years ago

The same thing with the latest Datatype99 and Metalang99:

    Command being timed: "make"
    User time (seconds): 4.56
    System time (seconds): 0.16
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.73
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 202744
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 78937
    Voluntary context switches: 229
    Involuntary context switches: 245
    Swaps: 0
    File system inputs: 0
    File system outputs: 5144
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Only 200 MB.

So it uses double the resources (time and memory).

hirrolot commented 3 years ago

Well, again I've done some optimisations for lists. Now it's only 0m0,030s slower than the v0.2.0 version.

picca commented 3 years ago

Here is my result:

    Command being timed: "make"
    User time (seconds): 3.63
    System time (seconds): 0.18
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.81
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 166232
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 69181
    Voluntary context switches: 231
    Involuntary context switches: 134
    Swaps: 0
    File system inputs: 0
    File system outputs: 5048
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Only 166 MB.

hirrolot commented 3 years ago

For your DetectorType, it now prints

    \time -f "%M" gcc playground.c -Imetalang99/include -I. -ftrack-macro-expansion=0 -E
    58372

picca commented 3 years ago

It is getting better and better :)

    Command being timed: "make"
    User time (seconds): 3.09
    System time (seconds): 0.16
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.25
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 146452
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 65155
    Voluntary context switches: 229
    Involuntary context switches: 60
    Swaps: 0
    File system inputs: 0
    File system outputs: 5208
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

146 MB

hirrolot commented 3 years ago

I've optimised pattern matching a bit. At this time, I see nothing to be optimised more, so I'm going to release v0.3.0 now.

picca commented 3 years ago

Here are my numbers :)

    Command being timed: "make"
    User time (seconds): 3.25
    System time (seconds): 0.18
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.42
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 146604
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 66311
    Voluntary context switches: 230
    Involuntary context switches: 30
    Swaps: 0
    File system inputs: 0
    File system outputs: 4952
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

hirrolot commented 3 years ago

How does it perform now?

picca commented 3 years ago

    Command being timed: "make"
    User time (seconds): 2.93
    System time (seconds): 0.21
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.12
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 133944
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 63174
    Voluntary context switches: 230
    Involuntary context switches: 60
    Swaps: 0
    File system inputs: 0
    File system outputs: 4952
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

hirrolot commented 3 years ago

Ping @picca

picca commented 3 years ago

a lot better :))

    Command being timed: "make"
    User time (seconds): 2.54
    System time (seconds): 0.22
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.75
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 94696
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 1
    Minor (reclaiming a frame) page faults: 51775
    Voluntary context switches: 232
    Involuntary context switches: 38
    Swaps: 0
    File system inputs: 144
    File system outputs: 5144
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

hirrolot commented 3 years ago

So it's almost like https://github.com/Hirrolot/datatype99/issues/5#issuecomment-785189950 (before I started to use lists)?

picca commented 3 years ago

Yes, only a 2 MB difference remains.

hirrolot commented 3 years ago

@picca, try again, please.

picca commented 3 years ago

it is a lot better :))

    Command being timed: "make"
    User time (seconds): 2.60
    System time (seconds): 0.15
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.74
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 66956
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 48917
    Voluntary context switches: 231
    Involuntary context switches: 99
    Swaps: 0
    File system inputs: 0
    File system outputs: 5208
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

hirrolot commented 3 years ago

Nice, even better than the initial version.
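As a recap, the peak memory figures reported in this thread from the first `-ftrack-macro-expansion=0` measurement onwards can be put side by side. The sketch below assumes that all of the later measurements were also taken with that flag (the thread does not say so explicitly), uses made-up labels for the intermediate updates, and divides by 1000 to match the rounding used in the comments above.

```c
#include <stddef.h>
#include <stdio.h>

struct sample {
    const char *label; /* illustrative label, not from the thread */
    long peak_kb;      /* "Maximum resident set size (kbytes)" as reported */
};

/* Values copied from the /usr/bin/time outputs above, in thread order. */
static const struct sample samples[] = {
    {"version before lists",     92668},
    {"first list-based version", 202744},
    {"update 2",                 166232},
    {"update 3",                 146452},
    {"update 4",                 146604},
    {"update 5",                 133944},
    {"update 6",                 94696},
    {"final version",            66956},
};
static const size_t sample_count = sizeof samples / sizeof samples[0];

static double kb_to_mb(long kb) {
    return (double)kb / 1000.0;
}

/* Ratio to the first entry, i.e. the version before lists were introduced. */
static double relative_to_baseline(long kb) {
    return (double)kb / (double)samples[0].peak_kb;
}

static void print_summary(void) {
    for (size_t i = 0; i < sample_count; i++) {
        printf("%-26s %7.1f MB  x%.2f\n", samples[i].label,
               kb_to_mb(samples[i].peak_kb),
               relative_to_baseline(samples[i].peak_kb));
    }
}
```

Calling `print_summary()` shows the first list-based version at a bit more than twice the baseline and the final version below it, which matches the closing remark.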