fidals / stroyprombeton


Restore units for tags #753

Closed ArtemijRodionov closed 5 years ago

ArtemijRodionov commented 5 years ago

Some tags had wrong names until this issue: https://github.com/fidals/stroyprombeton/issues/747

But since that issue was resolved, these tags have no units.

580.0 Масса -> 580.0
279 Длина -> 279

Try to restore the units from the most common words in the tag names.

ArtemijRodionov commented 5 years ago

I used this code to infer the units for tags:

from collections import Counter
from itertools import chain

from stroyprombeton import models as stb_models


def most_common(group):
    # Count every space-separated word across the tag names of the group.
    return Counter(chain.from_iterable(
        t.name.split(' ') for t in group.tags.all()
    )).most_common(10)


for g in stb_models.TagGroup.objects.all():
    print(g.name, ':', most_common(g))

The result:

Длина : [('м', 760), ('279', 2), ('4600', 2), ('1860', 2), ('3360', 2), ('2760', 2), ('2160', 2), ('3000', 2), ('4000', 2), ('5280', 2)]
Ширина : [('м', 403), ('750', 2), ('900', 2), ('1100', 2), ('1000', 2), ('1200', 2), ('2100', 2), ('2500', 2), ('1500', 2), ('2000', 2)]
Высота : [('м', 439), ('750', 2), ('900', 2), ('1100', 2), ('200', 2), ('250', 2), ('300', 2), ('350', 2), ('400', 2), ('580', 2)]
Масса : [('кг', 1465), ('2330.0', 2), ('5280.0', 2), ('480.0', 2), ('900.0', 2), ('5300.0', 2), ('6300.0', 2), ('350.0', 2), ('470.0', 2), ('700.0', 2)]
Объём : [('м', 946), ('куб', 946), ('0.97', 2), ('1.4', 2), ('2.2', 2), ('0.2', 2), ('0.35', 2), ('1.3', 2), ('2.52', 2), ('0.13', 2)]
Внешний диаметр : [('м', 43), ('660', 2), ('910', 2), ('1200', 2), ('1240', 2), ('1530', 2), ('1780', 2), ('1820', 2), ('1940', 2), ('2400', 2)]
Внутренний диаметр : [('м', 17), ('500', 2), ('750', 2), ('1000', 2), ('1250', 2), ('1500', 2), ('2000', 2), ('400', 1), ('700', 1), ('2500', 1)]
Рабочая документация : [('Серия', 42), ('выпуск', 28), ('3.900.1-10,', 8), ('1.152.1-8,', 7), ('1', 6), ('3.006.1-3/83,', 3), ('2', 3), ('3.902.1-12,', 3), ('3', 2), ('6', 2)]
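
A rough way to read the unit off counts like these, as a sketch only: keep the words that are frequent and are not numbers. The helper `guess_unit` and its `min_count` threshold are hypothetical and are not part of the project code.

```python
def guess_unit(common, min_count=10):
    """Join frequent non-numeric words from Counter.most_common() output.

    min_count is an arbitrary, assumed threshold: real values mostly
    occur once or twice, while a unit occurs in hundreds of tag names.
    """
    def is_number(word):
        try:
            float(word.rstrip(','))
            return True
        except ValueError:
            return False

    return ' '.join(
        word for word, count in common
        if count >= min_count and not is_number(word)
    )


print(guess_unit([('м', 946), ('куб', 946), ('0.97', 2), ('1.4', 2)]))
print(guess_unit([('кг', 1465), ('2330.0', 2), ('5280.0', 2)]))
```

Groups without a unit still need a manual check: for Рабочая документация this heuristic would return the frequent words Серия and выпуск, which are not units.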

This is the mapping from tag group to unit:

Длина, Высота, Ширина, Внешний диаметр, Внутренний диаметр -> м
Масса -> кг
Объём -> м куб

ArtemijRodionov commented 5 years ago

Done

ArtemijRodionov commented 5 years ago

I have used this script:

from django.db.utils import IntegrityError

from stroyprombeton import models as stb_models


def add_units_to_tags():
    """Append the missing unit to tag names; return the tags that failed to save."""
    m = 'м'
    # Groups whose unit is a single word.
    units = {
        'Длина': m,
        'Высота': m,
        'Ширина': m,
        'Внешний диаметр': m,
        'Внутренний диаметр': m,
        'Масса': 'кг',
    }
    duplicates = []
    # Groups whose unit has several words, so it is searched for as a substring.
    complex_units = {'Объём': 'м куб'}
    for tag in stb_models.Tag.objects.all():
        group_name = tag.group.name
        if group_name not in units and group_name not in complex_units:
            print(f'Skip {group_name}')
            continue
        # Add the unit only if the tag name does not contain it yet.
        if group_name in units and units[group_name] not in tag.name.split(' '):
            units_to_set = units[group_name]
        elif group_name in complex_units and len(tag.name.split(complex_units[group_name])) == 1:
            units_to_set = complex_units[group_name]
        else:
            units_to_set = ''
        tag.name = f'{tag.name} {units_to_set}'.strip()
        try:
            tag.save()
        except IntegrityError:
            # Saving failed on a database constraint; record the tag as a duplicate.
            duplicates.append({
                'dup_id': tag.id,
                'orig': {
                    'name': tag.name,
                    'group_id': tag.group.id,
                },
            })
    return duplicates

duker33 commented 5 years ago

@artemiy312, it seems we should have мм instead of м for metric dimensions. That's pretty much what #746 is about. Assigned it to you.
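
For illustration only, a minimal sketch of what swapping the unit label in a tag name could look like. `swap_unit` is a hypothetical helper, not code from #746. It changes only the trailing label and does not convert the numeric value.

```python
def swap_unit(name, old='м', new='мм'):
    """Replace a trailing unit word in a tag name, leaving the value as is."""
    head, sep, tail = name.rpartition(' ')
    # Only touch names of the form '<value> <old unit>'.
    if sep and tail == old:
        return f'{head} {new}'
    return name


print(swap_unit('279 м'))
print(swap_unit('0.97 м куб'))
```

Names that end in another unit, such as `0.97 м куб` or `580.0 кг`, are returned unchanged.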

ArtemijRodionov commented 5 years ago

I have added units for 49 tags. The remaining 1577 were removed as duplicates in this issue: https://github.com/fidals/stroyprombeton/issues/754
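
A split like this can be estimated before touching the database. The sketch below is a dry run on plain `(group, name)` pairs. It assumes that a tag name must be unique within its group, and it checks for the unit with `endswith` rather than with the word-split test from the script above. `UNITS`, `with_unit` and `dry_run` are hypothetical names.

```python
from collections import defaultdict

# Part of the group-to-unit mapping from this thread.
UNITS = {'Длина': 'м', 'Масса': 'кг', 'Объём': 'м куб'}


def with_unit(group, name):
    """Return name with the group's unit appended, unless it already ends with it."""
    unit = UNITS.get(group)
    if unit is None or name == unit or name.endswith(' ' + unit):
        return name
    return f'{name} {unit}'


def dry_run(tags):
    """Split (group, name) pairs into renamed tags and would-be duplicates."""
    seen = defaultdict(set)
    for group, name in tags:
        seen[group].add(name)
    renamed, duplicates = [], []
    for group, name in tags:
        new_name = with_unit(group, name)
        if new_name == name:
            continue  # the unit is already there or the group has none
        if new_name in seen[group]:
            # Assumed uniqueness rule: the same name twice in one group.
            duplicates.append((group, name))
        else:
            seen[group].add(new_name)
            renamed.append((group, name, new_name))
    return renamed, duplicates


tags = [('Длина', '279'), ('Длина', '279 м'), ('Масса', '580.0')]
renamed, duplicates = dry_run(tags)
print(len(renamed), 'renamed,', len(duplicates), 'duplicates')
```

Here `279` collides with the existing `279 м` and is reported as a duplicate, while `580.0` gets its unit.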