paris-saclay-cds / ramp-workflow

Toolkit for building predictive workflows on top of pydata (pandas, scikit-learn, pytorch, keras, etc.).
https://paris-saclay-cds.github.io/ramp-docs/
BSD 3-Clause "New" or "Revised" License
68 stars 42 forks source link

MNT stop using colored in Windows since it does not work #224

Closed glemaitre closed 4 years ago

glemaitre commented 4 years ago

closes #157

It is annoying when using a windows command prompt.

codecov[bot] commented 4 years ago

Codecov Report

Merging #224 into master will increase coverage by 0.02%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #224      +/-   ##
==========================================
+ Coverage   81.59%   81.62%   +0.02%     
==========================================
  Files         130      130              
  Lines        4853     4860       +7     
==========================================
+ Hits         3960     3967       +7     
  Misses        893      893
Impacted Files Coverage Δ
rampwf/utils/pretty_print.py 100% <100%> (ø) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e0de28a...98bddb6. Read the comment docs.

kegl commented 4 years ago

@glemaitre Please unmerge this, colors do work in windows, both in the jupyter terminal and in cmder (which is the terminal app everyone should use for other reasons). In the team we all use windows and we want to continue to have colors. If you really need to do this, please make it optional (like a --nocolors switch).

glemaitre commented 4 years ago

I don't think that rolling back is an option:

Basically, I had to make a choice between you all having colours or my users seeing cryptic outputs in their default shell or you all seeing score in black and white and my users able to see the results. I think this is the best default for now.

I agree that we should make an effort to improving support. So instead to detect if we are on windows which should check which prompt is requiring the command line. If it is one supporting colour then we output with colour otherwise not.

agramfort commented 4 years ago

@kegl is there a way to detect that you are in a modern windows shell from Python? maybe cmder sets some env variable we can use? I've also see many windows users struggle with colors and end up running the ramp-test from jupyter to see something. I agree that we should aim with something that works with default environments.

kegl commented 4 years ago

@agramfort I have no idea if we can know if we are in commander, but it's also the same running from the jupyter notebook (which is a great option for students, if you ask me). (Actually, for teaching, I would use a function to launch training so they never have to leave the notebook.)

@glemaitre

  1. Installing cmder is easy and it will make your life easier in zillion different ways.
  2. Everything can be launched from cmder, student won't see the difference.
  3. Those who can't install cmder and don't want to use the jupyter interface can use a switch.

Look guys, it will be funny to split on such a minor issue, but I'm ready to do it. In my group we can't afford our everyday tool being changed unilaterally.

agramfort commented 4 years ago

from commander can you report what you see with:

import os print(os.environ)

agramfort commented 4 years ago

for example my terminal contains:

'COLORTERM': 'truecolor'

in environ

glemaitre commented 4 years ago

Look guys, it will be funny to split on such a minor issue, but I'm ready to do it. In my group we can't afford our everyday tool being changed unilaterally.

I would have really expected a bit more appreciation on some work that I carry on during my own free time for the benefit of RAMP users, either in ramp-board and ramp-workflow while this is far away from my daily job. Hoping that @kegl and Huawei can give a long life to both project because it will be without me from now on.

kegl commented 4 years ago

@glemaitre I appreciate very much what you guys did with Joris. But we still can't afford to get our everyday tool changed unilaterally. I'm saying this with a cold head. We're planning to have quite a bit of change in ramp-workflow, for example on how the cross validation is used, possibly breaking backward compatibility, at which point we would probably split anyway.

albertcthomas commented 4 years ago

@kegl let's discuss about this in real life.

kegl commented 4 years ago

@agramfort I don't have COLORTERM in any of my terminals. In linux and mac I have TERM='xterm-256color' and in cmder I have TERM='cygwin']. In windows system terminal I have no TERM defined. In Jupyter launched from cmder (and also on my mac) TERM='xterm-color'. I can't launch jupyter from the windows term, some DLL is missing.

I'd be happy with any setup that doesn't change the default behavior. What about e.g. asking for os.environ['TERM'], and if it is not defined, have no color?

kegl commented 4 years ago

Or a bit more conservative: enumerate all the TERMs that we know supporting colors, then if someone needs it for a new TERM, add it to the list.

glemaitre commented 4 years ago

But we still can't afford to get our everyday tool changed unilaterally. I'm saying this with a cold head.

IMHO, the current PR was a bug fix for users with vanilla Windows install (unreadable output). It includes a regression for advanced Windows installs (black and white output) which then requires an enhancement as I proposed at the end of https://github.com/paris-saclay-cds/ramp-workflow/pull/224#issuecomment-592975116

ramp-workflow has now versioning and can be installed from PyPI meaning that you can downgrade easily if one regression is really problematic (it will not change from Friday to Monday when you come back from the weekend). When problematic, I would expect users/devs to open an issue such that we discuss the way to solve it and implement it for the benefit of the whole project users.

If this is not the way that we are going to work, I will not invest any of my free time in this project. I don't want to deal with unconstructive feedbacks (I does not work for me/us so don't change it).

We're planning to have quite a bit of change in ramp-workflow, for example on how the cross validation is used, possibly breaking backward compatibility, at which point we would probably split anyway.

In this case, we could adapt ramp-engine and simplify the code in such way we do not require ramp-worfklow. The process could be done in the following manner (we should only need compatible scorers).

from problem import get_train_data
from problem import get_test_data
from sklearn.model_selection import cross_validate
from sklearn.model_selection import ShuffleSplit

X_train, y_train = get_train_data()
X_test, y_test = get_test_data()
cv = ShuffleSplit(n_splits=8, random_state=57)

cv_results = cross_validate(
 get_estimator(), X_train, y_train, cv=cv,
 scoring="neg_mean_squared_error",
 return_train_score=True, return_estimator=True,
)
df_results = pd.DataFrame({
 "train": cv_results["train_score"],
 "valid": cv_results["test_score"],
})

df_results = df_results.apply(lambda x: np.sqrt(-x))

from sklearn.metrics import mean_squared_error
scores = []
for est in cv_results["estimator"]:
 y_pred = est.predict(X_test)
 score = mean_squared_error(y_test, y_pred)
 scores.append(np.sqrt(score))
df_results["test"] = scores
df_results

It is true that it is missing that bagged score will be missing but it would be supported by contributing to https://github.com/scikit-learn/scikit-learn/pull/15907

agramfort commented 4 years ago

What about e.g. asking for os.environ['TERM'], and if it is not defined, have no color?

this seems to do the job. Can you try a PR and report back to us using a couple of different windows machines?

I will ask around too

kegl commented 4 years ago

@glemaitre: yes, that's the general direction we have in mind, getting one level up in the abstraction, letting scripts to be specified, and only controling the input (set of submission files) and output (score matrix). We should have a separate issue on this to work out the specification.

albertcthomas commented 4 years ago

colors are not supported in the anaconda prompt. this is the output:

Training submissions\starting_kit ...
CV fold 0
glemaitre commented 4 years ago

Yes it was the point of this PR. The idea now would be to detect that we are running the command from this shell to avoid coloring by default.

Sent from my phone - sorry to be brief and potential misspell.

kegl commented 4 years ago

https://github.com/paris-saclay-cds/ramp-workflow/pull/225