stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
2.56k stars 365 forks source link

Introduce Writer Callback #1361

Closed betanalpha closed 7 years ago

betanalpha commented 9 years ago

Refactor internal Stan code to use a writer callback class left for the interfaces to implement. Using the writer callback everywhere will remove the last traces of cout/cerr from the code (addressing #699) while giving the interfaces complete control over output (addressing #453).

ariddell commented 9 years ago

I appreciate what you're doing here, but it seems a bit hasty (esp. probe_args). It feels like one is inventing a protocol from scratch to communicate key value pairs via strings. At minimum, I think I'd prefer we pick something recognizable (like JSON) -- or do a little bit more work to articulate what are valid types of values and spec it in Protobuf.

I suppose I'd also like to see what the big C++ object approach looks like. I don't have that much time this week, but I'd be willing to give a shot at articulating my ideas in a cmdstan branch like this one.

betanalpha commented 9 years ago

To be clear, the argument stuff is not new. probe_args is for automated testing.

On Mar 19, 2015, at 5:52 PM, Allen Riddell notifications@github.com wrote:

I appreciate what you're doing here, but it seems a bit hasty (esp. probe_args). It feels like one is inventing a protocol from scratch to communicate key value pairs via strings. At minimum, I think I'd prefer we pick something recognizable (like JSON) -- or do a little bit more work to articulate what are valid types of values and spec it in Protobuf.

I suppose I'd also like to see what the big C++ object approach looks like. I don't have that much time this week, but I'd be willing to give a shot at articulating my ideas in a cmdstan branch like this one.

— Reply to this email directly or view it on GitHub.

ariddell commented 9 years ago

Hmm.. it would be cool if the branch was referenced here somehow (maybe add "Closes #1361" in the commit?). Here's a commit I was looking at: c68ac51d2edb3fb8e59cbeccdfffa31c1879338d

For Python and R I think it's fair to say that it's much easier to create a (pointer to) a struct (or C++ class) and pass that around than to write a C++ callback.

betanalpha commented 9 years ago

The callbacks here are just classes that implement certain functions — they should be straightforward to implement (see the cout/cerr and stream implementations that will eventually move to CmdStan).

On Mar 19, 2015, at 6:36 PM, Allen Riddell notifications@github.com wrote:

Hmm.. it would be cool if the branch was referenced here somehow (maybe add "Closes #1361" in the commit?). Here's a commit I was looking at: c68ac51

For Python and R I think it's fair to say that it's much easier to create a (pointer to) a struct (or C++ class) and pass that around than to write a C++ callback.

— Reply to this email directly or view it on GitHub.

ariddell commented 9 years ago

I suppose the argument against having interfaces have to write C++ functions starts like this: it's easier for interfaces (not written in C++) to pass around a pointer or a chunk of bytes without knowing too much about it. In fact, with JSON or protobuf they potentially don't even need to write C++ code at all (which I think would be great for encouraging folks to write interfaces). Having interfaces be required to implement a C++ class with specific functions is a significantly bigger hurdle. R is actually unusual in having good support for C++ with Rcpp. Cython only started supporting C++ very recently (in think within the last year or two). Anyway, these are pieces of evidence in support of providing an interface that requires that minimal (potentially 0) C++ code be written.

It might be helpful to check out how RStan (and PyStan, more or less) currently deal with getting config and defaults setup:

https://github.com/stan-dev/rstan/blob/master/rstan/rstan/inst/include/rstan/stan_args.hpp

bob-carpenter commented 9 years ago

If you really don't want to work at the C++ level, it should be possible to completely rewrite PyStan to follow Stan.jl and MatlabStan (and my original code for RStan), which would have the user specify a command-line argument as a string, write the data out to a file, and then execute a CmdStan process and then collect up the CSV data.

The major problem with this is that it involves an I/O roundtrip, but that's not usually a bottleneck.

It solves all of Michael's issues with drift. And it means you don't need to write any C++ at all.

It would still mean implementing all the posterior analysis stuff in Python. Or you could call the print function in CmdStan and just show it as a string. Or you could call the C++, but that'd put you back in C++ land.

On Mar 20, 2015, at 7:26 AM, Allen Riddell notifications@github.com wrote:

I suppose the argument against having interfaces have to write C++ functions starts like this: it's easier to for interfaces to pass around a pointer or a chunk of bytes without knowing too much about it. In fact, with JSON or protobuf they don't even need to touch C++ code potentially. Having them implement a C++ class with specific properties is a significantly bigger hurdle.

It might be helpful to check out how RStan (and PyStan, more or less) currently deal with getting config and defaults setup:

https://github.com/stan-dev/rstan/blob/master/rstan/rstan/inst/include/rstan/stan_args.hpp

— Reply to this email directly or view it on GitHub.

maverickg commented 9 years ago

On Thu, Mar 19, 2015 at 11:36:19AM -0700, Allen Riddell wrote:

Hmm.. it would be cool if the branch was referenced here somehow (maybe add "Closes #1361" in the commit?). Here's a commit I was looking at: c68ac51d2edb3fb8e59cbeccdfffa31c1879338d

For Python and R I think it's fair to say that it's much easier to create a (pointer to) a struct (or C++ class) and pass that around than to write a C++ callback. I also think callback should not be used too much. It adds indirection and inflexibility.

Jiqiang


Reply to this email directly or view it on GitHub: https://github.com/stan-dev/stan/issues/1361#issuecomment-83708981

ariddell commented 9 years ago

I think shelling out to call cmdstan is a mistake but my only evidence is that RStan and PyStan are working rather smoothly right now and could be working even more smoothly if the steps outlined in the Stan 3.0 wiki page were carried through (i.e., the Python and R interface/classes need to be refactored).

I suspect Julia would be following the same route as Python and R if the interface could be accessed without C++/with less C++.

bob-carpenter commented 9 years ago

I like the callback plan! It's the right way to write these kind of interfaces for flexibility. Jiqiang --- what did you think was inflexible?

It does add a bit of indirection, or more specifically, infersion of control:

http://en.wikipedia.org/wiki/Inversion_of_control

How do you propose returning results without something like a callback? Were you thinking just getting a protocol buffer object as a string or something like a C++ data structure returned? Both are heavy compared to callbacks, which don't produce any long-lived memory overhead, much less conversions to strings and back, which is really slow. They also require all the memory to be buffered up, whereas CmdStan is going to stream answers to an output stream (typically a file).

I'm having a bit of trouble seeing how we can stay away from C++ and avoid just calling CmdStan as an independent process doing everything through files.

On Mar 20, 2015, at 10:04 AM, maverickg notifications@github.com wrote:

On Thu, Mar 19, 2015 at 11:36:19AM -0700, Allen Riddell wrote:

Hmm.. it would be cool if the branch was referenced here somehow (maybe add "Closes #1361" in the commit?). Here's a commit I was looking at: c68ac51d2edb3fb8e59cbeccdfffa31c1879338d

For Python and R I think it's fair to say that it's much easier to create a (pointer to) a struct (or C++ class) and pass that around than to write a C++ callback. I also think callback should not be used too much. It adds indirection and inflexibility.

Jiqiang


Reply to this email directly or view it on GitHub: https://github.com/stan-dev/stan/issues/1361#issuecomment-83708981 — Reply to this email directly or view it on GitHub.

sakrejda commented 9 years ago

Are these things really as mutually exclusive as it sounds? With a consistent C++-level interface which requires callbacks I don't think it would be too hard to write standard callbacks for interfacing with protocol buffers much like Michael wrote some standard ones for working with streams. At least that's my impression based on the simple tutorials: https://developers.google.com/protocol-buffers/docs/cpptutorial. That adds that pinch point where interfaces could be standardized without pushing protocol-buffers stuff deep into Stan. It also gets you around C++ being harder to call than C and if somebody wants to make a high-performance interface they can write custom callbacks in C++. I'm having a hard time telling if that's over-engineering but I don't think so.

bob-carpenter commented 9 years ago

All else being equal, I'd prefer not to have a chain of dominos in the interface --- they're hard to maintain.

Also, if the Protocol Buffer to C++ structure isn't in Stan, it doesn't solve the interface's problem of not wanting to write so much C++.

One place where Protocol buffers aren't sufficient is for interrupts --- we need at least one interface-driven callback for interrupts so we can stop C++ Stan's looping if the interface wants to stop.

On Mar 21, 2015, at 4:12 AM, Krzysztof Sakrejda notifications@github.com wrote:

Are these things really as mutually exclusive as it sounds? With a consistent C++-level interface which requires callbacks I don't think it would be too hard to write standard callbacks for interfacing with protocol buffers much like Michael wrote some standard ones for working with streams. At least that's my impression based on the simple tutorials: https://developers.google.com/protocol-buffers/docs/cpptutorial. That adds that pinch point where interfaces could be standardized without pushing protocol-buffers stuff deep into Stan. It also gets you around C++ being harder to call than C and if somebody wants to make a high-performance interface they can write custom callbacks in C++. I'm having a hard time telling if that's over-engineering but I don't think so.

— Reply to this email directly or view it on GitHub.

bob-carpenter commented 9 years ago

On Mar 20, 2015, at 10:16 AM, Allen Riddell notifications@github.com wrote:

I think shelling out to call cmdstan is a mistake but my only evidence is that RStan and PyStan are working rather smoothly right now and could be working even more smoothly if the steps outlined in the Stan 3.0 wiki page were carried through (i.e., the Python and R interface/classes need to be refactored).

Things are definitely much smoother than they use to be, but we still face:

Also, a thin wrapper would solve the

A very thin wrapper (just passing a string) would mean

I suspect Julia would be following the same route as Python and R if the interface could be accessed without C++.

See Rob's response.

ariddell commented 9 years ago

Hi Bob,

On 03/20, Bob Carpenter wrote:

All else being equal, I'd prefer not to have a chain of dominos in the interface --- they're hard to maintain.

I agree, but the potential gains of having an extremely smooth installation experience (via install.packages or "pip install pystan") could trump this, I think. I realize there are some R packages that compile and call external programs (topicmodels comes to mind, which I think compiles and runs GibbsLDA++), but this approach is less common in Python.

Also, if the Protocol Buffer to C++ structure isn't in Stan, it doesn't solve the interface's problem of not wanting to write so much C++.

This reflects my thinking as well.

One place where Protocol buffers aren't sufficient is for interrupts --- we need at least one interface-driven callback for interrupts so we can stop C++ Stan's looping if the interface wants to stop.

I'm perfectly fine with the callback here. It happens that the callback can be in C -- which is nice for (future) interfaces that do not have support for interfacing with C++.

ariddell commented 9 years ago

It has occurred to me that it would be helpful if we had some examples of other software projects that deal with similar problems. Clearly Google has solved the problem by passing around Protocol Buffers for everything. But closer to home (i.e., scientific computing) is the Juypter/IPython project, which is a bit like RStudio but supports working with Octave, R, Julia, and Python. Here's how it works: http://ipython.org/ipython-doc/3/development/how_ipython_works.html

Basically IPython exposes ZeroMQ sockets and passes around everything with JSON. It's easy to appreciate that this makes interoperability trivial for any language that supports JSON and ZeroMQ (i.e., everything). Of course, in principle it's not too far removed from the Google approach. It's also pretty slow.

That this discussion is taking time makes perfect sense to me. Up until this point there hasn't been much/any effort to coordinate development between R/Py/CmdStan when that coordination involved more than 2 people.

Perhaps it would be helpful if I added more details to an alternative that avoided callbacks to the Wiki page? I'd also like to hear with @maverickg thinks about all of this.

ariddell commented 9 years ago

(in response to @bob-carpenter's earlier comment; github needs threaded issue discussions!)

I don't mind updating PyStan every release of Stan (especially if library path renaming is kept to a minimum).

Here's Rob's comment from the list:

"As far as Julia and today’s Stan.jl are concerned, for Julia 0.3.x Allen’s earlier comment about following R and Python without C++ is correct. As alternatives folks can also fall back to RCall.jl and/or PyCall.jl."

bob-carpenter commented 9 years ago

Replying to two mails from Allen at once:

...

I agree, but the potential gains of having an extremely smooth installation experience (via install.packages or "pip install pystan") could trump this, I think. I realize there are some R packages that compile and call external programs (topicmodels comes to mind, which I think compiles and runs GibbsLDA++), but this approach is less common in Python.

Many R packages compile and run extern C, C++ or Fortran code. The obstacle presented by Stan is that we need to compile each model at runtime.

So that smoothe installation is only going to happen for Stan for someone who has a recent enough C++ compiler installed.

...

Perhaps it would be helpful if I added more details to an alternative that avoided callbacks to the Wiki page? I'd also like to hear with @maverickg thinks about all of this.

That would be helpful to me at least.

ariddell commented 9 years ago

Added to the wiki a sketch of how the command function would work in a Protocol Buffer world:

https://github.com/stan-dev/stan/wiki/Stan-API-Refactor#command-with-protocol-buffers

Let me know what you think.

On 03/21, Bob Carpenter wrote:

Replying to two mails from Allen at once:

...

I agree, but the potential gains of having an extremely smooth installation experience (via install.packages or "pip install pystan") could trump this, I think. I realize there are some R packages that compile and call external programs (topicmodels comes to mind, which I think compiles and runs GibbsLDA++), but this approach is less common in Python.

Many R packages compile and run extern C, C++ or Fortran code. The obstacle presented by Stan is that we need to compile each model at runtime.

So that smoothe installation is only going to happen for Stan for someone who has a recent enough C++ compiler installed.

...

Perhaps it would be helpful if I added more details to an alternative that avoided callbacks to the Wiki page? I'd also like to hear with @maverickg thinks about all of this.

That would be helpful to me at least.

  • Bob

Reply to this email directly or view it on GitHub: https://github.com/stan-dev/stan/issues/1361#issuecomment-84494026

bob-carpenter commented 9 years ago

I don't think passing C++ objects is a problem --- we're connecting to them directly in C++, not trying to serialize as in protocol buffers.

As popular as protocol buffers are at Google (not an uncorrelated set of independent choices to use protocol buffers), are they used to communicate between C++ programs in memory (the way PyStan and RStan link to Stan) or just for serialized for transport via file or socket?

We looked at something like JSON, and it's very heavy to have structure per parameter group because of all the added brackets.
Right now, we're just passing the parameter values for a draw as an unstructured array, the downside of which is that it needs to be translated back to structure for the extractors (at least in RStan).

On Mar 24, 2015, at 12:19 PM, Allen Riddell notifications@github.com wrote:

Added to the wiki a sketch of how the command function would work in a Protocol Buffer world:

https://github.com/stan-dev/stan/wiki/Stan-API-Refactor#command-with-protocol-buffers

Let me know what you think.

On 03/21, Bob Carpenter wrote:

Replying to two mails from Allen at once:

...

I agree, but the potential gains of having an extremely smooth installation experience (via install.packages or "pip install pystan") could trump this, I think. I realize there are some R packages that compile and call external programs (topicmodels comes to mind, which I think compiles and runs GibbsLDA++), but this approach is less common in Python.

Many R packages compile and run extern C, C++ or Fortran code. The obstacle presented by Stan is that we need to compile each model at runtime.

So that smoothe installation is only going to happen for Stan for someone who has a recent enough C++ compiler installed.

...

Perhaps it would be helpful if I added more details to an alternative that avoided callbacks to the Wiki page? I'd also like to hear with @maverickg thinks about all of this.

That would be helpful to me at least.

  • Bob

Reply to this email directly or view it on GitHub: https://github.com/stan-dev/stan/issues/1361#issuecomment-84494026 — Reply to this email directly or view it on GitHub.

ariddell commented 9 years ago

My impression is that protocol buffers are used to communicate between distinct machines running services written in C++ (or Go).

Ultimately, I think the reason I'm attracted to Protocol Buffers is because it could make communication to and from Stan easier in a variety of settings (including the case where one machine is compiling and running the model and sending results to another machine). Allowing communication to and from R and Python has been the source of a lot of interesting work recently, including IPython and Shiny.

IPython and Shiny (somehow R and node.js talk to each other, right?) are making scientific computing much more accessible by abstracting (in some cases) away the computer (e.g., you can run IPython and RStudio on an ephemeral cloud server). In this context having some platform-agnostic way of sending and receiving data to and from stan seems like a good goal.

From the meeting yesterday I think we all agreed that the callback method would go forward and then I would be free to try things out on my own using Protocol Buffers and see how much of a performance hit (or gain?) there was.

re: callback method refactoring. I mentioned that I particularly like Bob's command function signature in the Wiki. I hope we can eventually get to something that simple.

bob-carpenter commented 9 years ago

On Mar 26, 2015, at 12:40 AM, Allen Riddell notifications@github.com wrote:

My impression is that protocol buffers are used to communicate between distinct machines running services written in C++ (or Go).

Ultimately, I think the reason I'm attracted to Protocol Buffers is because it could make communication to and from Stan easier in a variety of settings (including the case where one machine is compiling and running the model and sending results to another machine). Allowing communication to and from R and Python has been the source of a lot of interesting work recently, including IPython and Shiny.

I guess that need comes up because people want to mix R and Python packages and neither language has a complete set.

The danger is robustness and performance from having to write all the adapters and munge data back-and-forth, but I guess for most of what people do, neither of these is an issue --- if it were they probably wouldn't be using R or Python! R users seem to be happy to just restart R or RStudio when something crashes. I think that stems from them primarily being user-facing front ends rather than back-end servers and so not requiring any kind of service level agreements.

IPython and Shiny (somehow R and node.js talk to each other, right?)

Apparently not very robustly --- Jonah just told me there's a known (or suspected) bug in R's JSON transport package that causes Shiny apps to hang. I unfortunately ran into that one during a demo of shinyStan two days ago. As soon as things start to break like this, I begin to lose confidence in them. (Not an issue with shinyStan yet --- it's still early days there and I'm not expecting a lot of robustness yet, though I think I'm going to stay away from live demos until I can convince myself I can do them without freezing. It is a major issue with Stan, and we're still struggling with robustness, especially across platforms; it's why we keep prioritizing robustness and error checking over new features.)

are making scientific computing much more accessible by abstracting (in some cases) away the computer (e.g., you can run IPython and RStudio on an ephemeral cloud server). In this context having some platform-agnostic way of sending and receiving data to and from stan seems like a good goal.

Agreed. I just wanted to point out that it's a different goal than communicating between PyStan and the Stan C++ layer. We control both ends of that pipeline.

From the meeting yesterday I think we all agreed that the callback method would go forward and then I would be free to try things out on my own using Protocol Buffers and see how much of a performance hit (or gain?) there was.

Sounds good --- I think that's the right level of abstraction. We'd implement some kind of data serialization on top of a direct C++ API. The costs of encoding/decoding are usually fairly small compared to transport between machines, or even between disk and back.

syclik commented 9 years ago

Michael, did you change interface to interface_callbacks?

betanalpha commented 9 years ago

I did — I though it was more descriptive and would avoid confusion.

On Apr 3, 2015, at 3:34 PM, Daniel Lee notifications@github.com wrote:

Michael, did you change interface to interface_callbacks?

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

ok. I merged develop into the branch.

syclik commented 9 years ago

@betanalpha, were you able to run the unit tests on this branch? I've fixed a handful, but can't get all of it to run.

syclik commented 9 years ago

@betanalpha, don't worry about it... I'm almost through it, I think.

betanalpha commented 9 years ago

The tests were all working at my commit, but I haven’t checked anything after all of the recent merges...

On May 27, 2015, at 11:02 PM, Daniel Lee notifications@github.com wrote:

@betanalpha, don't worry about it... I'm almost through it, I think.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

That's what I was hoping you'd say. I'll get it working.

On Wednesday, May 27, 2015, Michael Betancourt notifications@github.com wrote:

The tests were all working at my commit, but I haven’t checked anything after all of the recent merges...

On May 27, 2015, at 11:02 PM, Daniel Lee <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

@betanalpha, don't worry about it... I'm almost through it, I think.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1361#issuecomment-106098154.

syclik commented 9 years ago

Ok, so there was definitely a test that wasn't running -- src/test/unit/mcmc/chains_test.cpp. I've gotten it to compile, but there's still an issue: the parser() function doesn't like the output generated by the current version of Stan. Know what's going on?

On Wed, May 27, 2015 at 6:42 PM, Daniel Lee bearlee@alum.mit.edu wrote:

That's what I was hoping you'd say. I'll get it working.

On Wednesday, May 27, 2015, Michael Betancourt notifications@github.com wrote:

The tests were all working at my commit, but I haven’t checked anything after all of the recent merges...

On May 27, 2015, at 11:02 PM, Daniel Lee notifications@github.com wrote:

@betanalpha, don't worry about it... I'm almost through it, I think.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1361#issuecomment-106098154.

betanalpha commented 9 years ago

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

On May 28, 2015, at 11:10 PM, Daniel Lee notifications@github.com wrote:

Ok, so there was definitely a test that wasn't running -- src/test/unit/mcmc/chains_test.cpp. I've gotten it to compile, but there's still an issue: the parser() function doesn't like the output generated by the current version of Stan. Know what's going on?

On Wed, May 27, 2015 at 6:42 PM, Daniel Lee bearlee@alum.mit.edu wrote:

That's what I was hoping you'd say. I'll get it working.

On Wednesday, May 27, 2015, Michael Betancourt notifications@github.com wrote:

The tests were all working at my commit, but I haven’t checked anything after all of the recent merges...

On May 27, 2015, at 11:02 PM, Daniel Lee notifications@github.com wrote:

@betanalpha, don't worry about it... I'm almost through it, I think.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1361#issuecomment-106098154.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

This one isn't a false-positive. I just regenerated the test files. I tried to debug it, but the parse method is just choking.

I'll take an alternative if you have one. You're right: this isn't a really good way to do a unit test. We need to separate out the file processing from the chains object, but we don't have a good way to do that right now.

PS. these files haven't changed since Dec 2013. The history before that isn't easy to get to. If you want to have these generated as part of the test, let me know. I think I can figure that out. But... I don't think that really helps, right?

On Thu, May 28, 2015 at 6:42 PM, Michael Betancourt < notifications@github.com> wrote:

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

betanalpha commented 9 years ago

Did you regenerate them with the corresponding cmdstan branch? It doesn’t look like the test files have the update format, hence the lexical cast errors. I’ll try to fix the test later this afternoon.

On May 29, 2015, at 1:35 AM, Daniel Lee notifications@github.com wrote:

This one isn't a false-positive. I just regenerated the test files. I tried to debug it, but the parse method is just choking.

I'll take an alternative if you have one. You're right: this isn't a really good way to do a unit test. We need to separate out the file processing from the chains object, but we don't have a good way to do that right now.

PS. these files haven't changed since Dec 2013. The history before that isn't easy to get to. If you want to have these generated as part of the test, let me know. I think I can figure that out. But... I don't think that really helps, right?

On Thu, May 28, 2015 at 6:42 PM, Michael Betancourt < notifications@github.com> wrote:

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

— Reply to this email directly or view it on GitHub.

betanalpha commented 9 years ago

By the way, at the end of the whole refactor it will be easy to run models solely through the C++ API allowing us to auto generate files for tests like these independent of any interface. The interfaces can then test their own implementations independently.

On May 29, 2015, at 9:55 AM, Michael Betancourt betanalpha@gmail.com wrote:

Did you regenerate them with the corresponding cmdstan branch? It doesn’t look like the test files have the update format, hence the lexical cast errors. I’ll try to fix the test later this afternoon.

On May 29, 2015, at 1:35 AM, Daniel Lee notifications@github.com wrote:

This one isn't a false-positive. I just regenerated the test files. I tried to debug it, but the parse method is just choking.

I'll take an alternative if you have one. You're right: this isn't a really good way to do a unit test. We need to separate out the file processing from the chains object, but we don't have a good way to do that right now.

PS. these files haven't changed since Dec 2013. The history before that isn't easy to get to. If you want to have these generated as part of the test, let me know. I think I can figure that out. But... I don't think that really helps, right?

On Thu, May 28, 2015 at 6:42 PM, Michael Betancourt < notifications@github.com> wrote:

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

Thanks. I used the latest develop branch: there was still something incompatible about this branch. If all it takes is to run again, I'll do that.

On Friday, May 29, 2015, Michael Betancourt notifications@github.com wrote:

By the way, at the end of the whole refactor it will be easy to run models solely through the C++ API allowing us to auto generate files for tests like these independent of any interface. The interfaces can then test their own implementations independently.

On May 29, 2015, at 9:55 AM, Michael Betancourt <betanalpha@gmail.com javascript:_e(%7B%7D,'cvml','betanalpha@gmail.com');> wrote:

Did you regenerate them with the corresponding cmdstan branch? It doesn’t look like the test files have the update format, hence the lexical cast errors. I’ll try to fix the test later this afternoon.

On May 29, 2015, at 1:35 AM, Daniel Lee <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

This one isn't a false-positive. I just regenerated the test files. I tried to debug it, but the parse method is just choking.

I'll take an alternative if you have one. You're right: this isn't a really good way to do a unit test. We need to separate out the file processing from the chains object, but we don't have a good way to do that right now.

PS. these files haven't changed since Dec 2013. The history before that isn't easy to get to. If you want to have these generated as part of the test, let me know. I think I can figure that out. But... I don't think that really helps, right?

On Thu, May 28, 2015 at 6:42 PM, Michael Betancourt < notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1361#issuecomment-106750344.

betanalpha commented 9 years ago

Yeah, fixing the format of the output files avoid the read_csv issue. But all the tests fail because the diagnostics are all slightly off due to the chains being rerun. I’ll let you figure out what you want to do there.

Also, perhaps we might as well move stan_csv_reader to cmdstan in this pull request? None of the Stan tests should depend on the cmdstan output or require that it be run.

On May 29, 2015, at 12:16 PM, Daniel Lee notifications@github.com wrote:

Thanks. I used the latest develop branch: there was still something incompatible about this branch. If all it takes is to run again, I'll do that.

On Friday, May 29, 2015, Michael Betancourt notifications@github.com wrote:

By the way, at the end of the whole refactor it will be easy to run models solely through the C++ API allowing us to auto generate files for tests like these independent of any interface. The interfaces can then test their own implementations independently.

On May 29, 2015, at 9:55 AM, Michael Betancourt <betanalpha@gmail.com javascript:_e(%7B%7D,'cvml','betanalpha@gmail.com');> wrote:

Did you regenerate them with the corresponding cmdstan branch? It doesn’t look like the test files have the update format, hence the lexical cast errors. I’ll try to fix the test later this afternoon.

On May 29, 2015, at 1:35 AM, Daniel Lee <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

This one isn't a false-positive. I just regenerated the test files. I tried to debug it, but the parse method is just choking.

I'll take an alternative if you have one. You're right: this isn't a really good way to do a unit test. We need to separate out the file processing from the chains object, but we don't have a good way to do that right now.

PS. these files haven't changed since Dec 2013. The history before that isn't easy to get to. If you want to have these generated as part of the test, let me know. I think I can figure that out. But... I don't think that really helps, right?

On Thu, May 28, 2015 at 6:42 PM, Michael Betancourt < notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Or the updated CSV reader doesn’t like static, deprecated files in the test_csv_files folder? I’ve hated these tests from the beginning because they never catch anything and just require us to change them every f’ing time to pass.

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1361#issuecomment-106750344.

— Reply to this email directly or view it on GitHub.

syclik commented 8 years ago

Here are the things that weren't included in stan-dev/stan#1616:

I'm leaving this here so I don't forget.

bob-carpenter commented 7 years ago

@syclik Is this all done and rolled into the refactor? If so, would you mind closing?

bob-carpenter commented 7 years ago

This is done.