cmu-phil / tetrad

Repository for the Tetrad Project, www.phil.cmu.edu/tetrad.
GNU General Public License v2.0

Saving a Tetrad session following bootstrapping search results in a very large file #1648

Closed MikeKonrad closed 1 year ago

MikeKonrad commented 1 year ago

I've sent Joe the dataset I've been using. I ran the search (BOSS) with default settings except for the number of bootstrap samples.

This is using tetrad-gui-7.4.0-launch.jar on macOS with Java:

java version "18.0.2.1" 2022-08-18
Java(TM) SE Runtime Environment (build 18.0.2.1+1-1)
Java HotSpot(TM) 64-Bit Server VM (build 18.0.2.1+1-1, mixed mode, sharing)

(Incidentally, running BOSS on a dataset of approximately 1600 rows and 17 numeric-valued columns with 10000 bootstrap samples takes only 35 seconds on my Mac, with whatever default amount of RAM Java uses when I run "java -jar *launch.jar" from my terminal.)

cg09 commented 1 year ago

Is it confidential? Can Joe pass it to me?

Clark


MikeKonrad commented 1 year ago

Hi Clark, Certainly. If you are curious about what the variables represent, I can provide some context later.

Mike



cg09 commented 1 year ago

Mike,

Thanks, you can tell me after I do some guesswork.


jdramsey commented 1 year ago

@MikeKonrad According to the profiling, the list of bootstrap graphs is growing in size in the code. It's possible that by not saving these separate bootstrap graphs, space could be conserved in the saved session. Let me refactor all of the bootstrap graph fields in all of the algcomparison algorithm wrappers to be transient.
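For readers unfamiliar with the mechanism: a field marked `transient` is skipped by standard Java serialization, so a large list held in such a field adds nothing to the serialized output. The following is only a minimal sketch of that effect, assuming the session is written with plain Java object serialization. The class and field names (`KeepsGraphs`, `DropsGraphs`, `bootstrapGraphs`) are hypothetical stand-ins, not Tetrad's actual wrappers, and strings stand in for graph objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TransientSketch {

    // Hypothetical wrapper that serializes every bootstrap graph it holds.
    static class KeepsGraphs implements Serializable {
        private static final long serialVersionUID = 1L;
        List<String> bootstrapGraphs = new ArrayList<>();
    }

    // Same shape, but the list is transient, so serialization skips it.
    // After deserialization this field would be null.
    static class DropsGraphs implements Serializable {
        private static final long serialVersionUID = 1L;
        transient List<String> bootstrapGraphs = new ArrayList<>();
    }

    // Serializes the object into memory and returns the number of bytes written.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Fills the list with placeholder strings standing in for per-sample graphs.
    static void fill(List<String> graphs, int n) {
        for (int i = 0; i < n; i++) {
            graphs.add("X1 --> X2; X2 --> X3; sample " + i);
        }
    }

    public static void main(String[] args) {
        KeepsGraphs keeps = new KeepsGraphs();
        DropsGraphs drops = new DropsGraphs();
        fill(keeps.bootstrapGraphs, 1000);
        fill(drops.bootstrapGraphs, 1000);
        System.out.println("non-transient: " + serializedSize(keeps) + " bytes");
        System.out.println("transient:     " + serializedSize(drops) + " bytes");
    }
}
```

The trade-off is that the per-sample graphs are not available after the session is reloaded, so anything that needs them later has to be recomputed.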

jdramsey commented 1 year ago

Alternatively, I could add a parameter to allow the user to decide whether the separate bootstrap graphs should be saved and set it to false by default.
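A rough sketch of what such a switch could look like, for illustration only. The names `runBootstrap`, `saveBootstrapGraphs`, and `BootstrapResult` are hypothetical and are not taken from the Tetrad code base; strings stand in for graphs, and the actual search and edge-frequency summary are replaced by placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class BootstrapRetentionSketch {

    // Hypothetical result holder: the summary is always kept,
    // the per-sample graphs only when the caller asks for them.
    static class BootstrapResult {
        final String summaryGraph;
        final List<String> sampleGraphs;

        BootstrapResult(String summaryGraph, List<String> sampleGraphs) {
            this.summaryGraph = summaryGraph;
            this.sampleGraphs = sampleGraphs;
        }
    }

    // Runs numSamples placeholder "searches"; retains each graph only if requested.
    static BootstrapResult runBootstrap(int numSamples, boolean saveBootstrapGraphs) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < numSamples; i++) {
            String graph = "graph for resample " + i; // placeholder for a real search
            // A real implementation would fold `graph` into the summary here.
            if (saveBootstrapGraphs) {
                kept.add(graph);
            }
        }
        return new BootstrapResult("summary over " + numSamples + " resamples", kept);
    }

    // Default matches the proposal above: do not keep the separate graphs.
    static BootstrapResult runBootstrap(int numSamples) {
        return runBootstrap(numSamples, false);
    }

    public static void main(String[] args) {
        System.out.println(runBootstrap(5000).sampleGraphs.size());
        System.out.println(runBootstrap(5000, true).sampleGraphs.size());
    }
}
```

With the default of false, only the summary travels with the saved session; users who want to inspect individual bootstrap graphs can opt in and accept the larger file.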

MikeKonrad commented 1 year ago

Thank you, Joe. This will work well for me!


jdramsey commented 1 year ago

@MikeKonrad Great! Maybe I'll send you a jar if you want to test it. I just tested it out a few minutes ago with 1000 bootstraps for BOSS for a 60-node graph, and the saved session was a reasonable size. But if you're still having difficulties then there may be another problem lurking as well.

jdramsey commented 1 year ago

@MikeKonrad Yeah, that seems to have done the trick. I just loaded your file and did 5000 bootstraps with BOSS with that parameter set to false, and it saves out as a session of 501 KB, which is fine.

I'll make you a jar, though, and send it. You can try it at your convenience and close this issue if you're OK with it.

jdramsey commented 1 year ago

Mike checked this--it works