Only 3 evaluation datasets are visible in custom charts

eu9ene commented 3 months ago

Custom charts are great, since it's hard to find metrics in the tables, but somehow they show results for only 3 datasets and lack the most important ones: those without augmentations.

https://wandb.ai/moz-translations/sl-en?nw=nwuserepavlov

La0 commented 3 months ago

Do you think we are missing the publication of some data in Taskcluster?

Or is the graph created with not enough data? I'm unsure whether this is a publication issue or a W&B configuration problem.

eu9ene commented 3 months ago

@vrigal mentioned that we publish data to the tables below and I see more datasets available there. So maybe it's just a problem with the custom graphs.

eu9ene commented 3 months ago

We also saw that some of the tables were empty, maybe it's a W&B bug in displaying them:

vrigal commented 3 months ago

Weights & Biases does not display tables well, but by going back and forth many times I saw that data is present for 5 of the 8 tables. The following ones are empty though:

flores_devtest_table
flores_aug-mix_devtest_table
mtdata_Neulab-tedtalks_test-1-eng-dan_table

I'm investigating why data is missing for those tables in the online publication.

About the missing bar charts, I don't know what happened. The Python client builds those "custom" charts automatically, and handling this manually would take a lot of effort. Maybe some of the charts were deleted directly by a user (this is not reversible), or the W&B Python client failed to build them. Either way, it would be nice to prevent anyone from deleting those charts, if possible.

By the way, I noticed that it is possible to group bars by ID instead of display name (related to https://github.com/mozilla/firefox-translations-training/issues/408#issuecomment-2187012317), which can temporarily help with comparing runs with similar names across multiple groups: Screenshot_2024-06-26_15-21-35

Unfortunately, the W&B Python client does not provide any help with this. I also tested adding a third column with a comparable name, but it does not appear in the chart's GroupKeys select. So in the end, manually grouping by ID seems to be the most reasonable way to do it without investing much effort in building charts.
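To illustrate the difference between grouping by display name and grouping by ID, here is a minimal pure-Python sketch. It does not use W&B and is not how W&B implements its charts; the row layout, the `group_rows` helper and the values are hypothetical and only mimic the observed behavior, where two runs sharing a name end up in the same bars.

```python
from collections import defaultdict

# Hypothetical chart rows: two runs with the same display name but different IDs.
rows = [
    {"id": "id_1", "name": "run", "metric": "score 1", "value": 10},
    {"id": "id_1", "name": "run", "metric": "score 2", "value": 20},
    {"id": "id_2", "name": "run", "metric": "score 1", "value": 90},
    {"id": "id_2", "name": "run", "metric": "score 2", "value": 80},
]


def group_rows(rows, key):
    """Collect the values of each (run key, metric) pair, i.e. one entry per bar."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row[key], row["metric"])].append(row["value"])
    return dict(grouped)


# Grouping by name mixes both runs into the same bars.
by_name = group_rows(rows, "name")
# Grouping by ID keeps one bar per run and per metric.
by_id = group_rows(rows, "id")

print(len(by_name), "bars when grouping by name")
print(len(by_id), "bars when grouping by ID")
```

With identical names, each bar in the name-based grouping holds values from both runs, which is why the runs cannot be compared until the chart is grouped by ID.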

vrigal commented 3 months ago

@eu9ene I could not figure out why the data is missing for those 3 evaluation tasks. Logs from the task seem fine.

eu9ene commented 3 months ago

We certainly did not delete anything manually. Also, I don't think editing is helpful because we need a way to quickly look at the dashboard and see what we're looking for. I can ask MLOps to reach out to W&B support to investigate this. Could you prepare a specific question and minimal steps to reproduce this issue?

vrigal commented 3 months ago

Thank you, that would be very helpful. I may look into the client's code, but asking for support is the way to go IMO. Here is a script to reproduce this case easily:

#!/usr/bin/env python3

import wandb

# First run: unique ID and group, but the same display name as the second run.
client_1 = wandb.init(
    project="test",
    id="id_1",
    group="group 1",
    name="run",
)
# Log a bar chart built from a two-row table of metrics.
client_1.log(
    {
        "Bar chart": wandb.plot.bar(
            wandb.Table(
                columns=["Metric", "Value"],
                data=[["score 1", 10], ["score 2", 20]]
            ),
            label="Metric",
            value="Value",
            title="test chart",
            split_table=False,
        )
    }
)
client_1.finish()

# Second run: different ID and group, identical display name ("run").
client_2 = wandb.init(
    project="test",
    id="id_2",
    group="group 2",
    name="run",
)
# Log the same chart with different values, so merged bars are easy to spot.
client_2.log(
    {
        "Bar chart": wandb.plot.bar(
            wandb.Table(
                columns=["Metric", "Value"],
                data=[["score 1", 90], ["score 2", 80]]
            ),
            label="Metric",
            value="Value",
            title="test chart",
            split_table=False,
        )
    }
)
client_2.finish()

Screenshot_2024-06-27_11-09-27

vrigal commented 2 months ago

Here is the question for W&B team:

"Help with plotting charts from runs with similar names" When publishing bar charts from the Python client, it is not possible to compare two runs with similar names because the data from the two runs is merged (see attached screenshot). What would be the easiest way to compare two runs with similar names here? We noticed it can be done manually from the W&B UI (group by ID option when editing the chart), but we need an automatic way to do it.

eu9ene commented 2 months ago

I filed a separate issue about empty tables https://github.com/mozilla/firefox-translations-training/issues/716

vitoria-wandb commented 2 months ago

hey @vrigal

Try this:

#!/usr/bin/env python3

import wandb

# Compared with the reproduction script, each run now has a distinct name.
client_1 = wandb.init(
    project="test_1",
    id="id_1",
    group="group-1",
    name="run_1",
)
client_1.log(
    {
        "Bar chart": wandb.plot.bar(
            wandb.Table(
                columns=["Metric", "Value"],
                data=[["score 1", 10], ["score 2", 20]]
            ),
            label="Metric",
            value="Value",
            title="test chart",
            split_table=False,
        )
    }
)
client_1.finish()

# Second run, named "run_2" instead of reusing the first run's name.
client_2 = wandb.init(
    project="test_1",
    id="id_2",
    group="group-2",
    name="run_2",
)
client_2.log(
    {
        "Bar chart": wandb.plot.bar(
            wandb.Table(
                columns=["Metric", "Value"],
                data=[["score 1", 90], ["score 2", 80]]
            ),
            label="Metric",
            value="Value",
            title="test chart",
            split_table=False,
        )
    }
)
client_2.finish()

Because of this, I would avoid giving runs the same name, since unusual behaviours arise when they share one. We already know internally that this happens when two runs have the same name, and it is registered as a bug.

Could you please also tell me more about your use case for having a group per run and multiple runs with the same name?

Let me know, thank you!
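One way to follow the "different name per run" suggestion automatically is to derive the display name from the group. The sketch below is only an illustration under assumptions: `unique_run_name` and `build_run_kwargs` are hypothetical helpers, not part of the W&B client or of this repository, and the resulting dictionaries are merely meant to be passed to `wandb.init` as keyword arguments (`project`, `id`, `group`, `name`).

```python
def unique_run_name(group: str, name: str, sep: str = "_") -> str:
    """Prefix the run name with a whitespace-free version of its group."""
    slug = "-".join(group.split())
    return f"{slug}{sep}{name}"


def build_run_kwargs(project: str, run_id: str, group: str, name: str) -> dict:
    """Build keyword arguments for wandb.init with a name unique across groups."""
    return {
        "project": project,
        "id": run_id,
        "group": group,
        "name": unique_run_name(group, name),
    }


# Two runs that would otherwise share the display name "run".
runs = [
    build_run_kwargs("test", "id_1", "group 1", "run"),
    build_run_kwargs("test", "id_2", "group 2", "run"),
]
names = [kwargs["name"] for kwargs in runs]

# Each dictionary could then be used as: wandb.init(**kwargs)
print(names)
```

The trade-off is longer names in the dashboard, in exchange for charts that no longer merge runs from different groups.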

vrigal commented 2 months ago

@eu9ene I think there is no other option than doing https://github.com/mozilla/firefox-translations-training/issues/408#issuecomment-2187012317.

We can probably close this issue. Thank you for opening another one for missing data.

eu9ene commented 2 months ago

@vrigal let's discuss the next steps on Monday. @vitoria-wandb Thank you for the suggestions, we'll get back to you after we sync on this.

vitoria-wandb commented 2 months ago

Yes, definitely @eu9ene! My suggestion is to ensure that each run has a different name. But we are curious to know more about your use case for having a group per run and multiple runs with the same name. Let us know, thank you!

mozilla / firefox-translations-training

Only 3 evaluation datasets are visible in custom charts #688