feVeRin / Singing_Voice_Synthesis

Singing voice synthesis

Dataloader setup #1

Closed dahyunnss closed 1 year ago

dahyunnss commented 1 year ago

code issue

def __getitem__(self, index): #issue
        # first, get target track
        index = index // self.samples_per_track
        print(3333)
        print(self.tracks)
        track_path = self.tracks[index]['path']
        min_duration = self.tracks[index]['min_duration']
        if self.random_chunks:
            start = random.randint(0, min_duration - self.seq_duration)
        else:
            start = 0

        audio_sources = []

        target_audio = load_audio(
            track_path / self.target_file, start=start, dur=self.seq_duration
        )
        target_audio = self.source_augmentations(target_audio)
        audio_sources.append(target_audio)

        for source in self.interferer_files:
            if self.random_track_mix:
                random_idx = random.choice(range(len(self.tracks)))
                track_path = self.tracks[random_idx]['path']
                if self.random_chunks:
                    min_duration = self.tracks[random_idx]['min_duration']
                    start = random.randint(0, min_duration - self.seq_duration)

            audio = load_audio(
                track_path / source, start=start, dur=self.seq_duration
            )

            audio = self.source_augmentations(audio)
            audio_sources.append(audio)

        stems = torch.stack(audio_sources)
        # # apply linear mix over source index=0
        x = stems.sum(0)

        # y = stems.reshape(-1, stems.size(2))
        y = stems[0]

        return x, y

def get_tracks(self):
        p = Path(self.root, self.split)
        print(p)
        for track_path in tqdm.tqdm(p.iterdir(), disable=True):
            print('ss', track_path.is_dir())
            if track_path.is_dir():
                source_paths = [track_path / s for s in self.source_files]
                print('pp',source_paths)
                if not all(sp.exists() for sp in source_paths):
                    print("exclude track ", track_path)
                    continue

                if self.seq_duration is not None:
                    infos = list(map(load_info, source_paths))
                    # get minimum duration of track
                    min_duration = min(i['duration'] for i in infos)
                    if min_duration > self.seq_duration:
                        yield ({
                            'path': track_path,
                            'min_duration': min_duration
                        })
                else:
                    yield ({'path': track_path, 'min_duration': None})

Error log

Traceback (most recent call last):
  File "train.py", line 214, in <module>
    main()
  File "train.py", line 127, in main
    item = train_dataset[0]
  File "/userHome/userhome1/sojeong/voice/Singing_Voice_Synthesis/U-net/data.py", line 93, in __getitem__
    track_path = self.tracks[index]['path']
IndexError: list index out of range
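
Because `train_dataset[0]` maps to track index 0 whatever `samples_per_track` is, the `IndexError` means `self.tracks` was an empty list, i.e. `get_tracks` yielded nothing. The thread does not say why. The sketch below is one hedged way to check: it repeats the file-existence test from `get_tracks` and reports which files each excluded folder lacks. `find_usable_tracks` and `track_index` are hypothetical helper names, not part of the repository, and the duration filter (`min_duration > self.seq_duration`) is left out because it needs an audio backend.

```python
from pathlib import Path


def find_usable_tracks(root, split, source_files):
    """Return (usable, excluded) track folders under root/split.

    A folder is usable only if every name in source_files exists inside it,
    mirroring the existence check in get_tracks. Hypothetical helper.
    """
    split_dir = Path(root, split)
    if not split_dir.is_dir():
        raise FileNotFoundError(f"split directory not found: {split_dir}")

    usable, excluded = [], []
    for track_path in sorted(split_dir.iterdir()):
        if not track_path.is_dir():
            continue
        # Collect the source files this folder lacks, if any.
        missing = [s for s in source_files if not (track_path / s).exists()]
        if missing:
            excluded.append((track_path, missing))
        else:
            usable.append(track_path)
    return usable, excluded


def track_index(index, num_tracks, samples_per_track):
    """Map a sample index to a track index, failing loudly on an empty dataset."""
    if num_tracks == 0:
        raise RuntimeError(
            "no usable tracks found; check root, split and source file names"
        )
    total = num_tracks * samples_per_track
    if not 0 <= index < total:
        raise IndexError(f"index {index} out of range for {total} samples")
    return index // samples_per_track
```

Calling `find_usable_tracks(root, split, source_files)` with the same values passed to the dataset shows whether the split folder contains track directories at all and, if it does, which file names do not match.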

feVeRin commented 1 year ago
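  1. I think going through a library will be easier than building the loader from scratch, so I updated dataloader.ipynb with code that uses the musdb library.

  2. In that file, loading the dataset audio through the musdb library with the vocals as the label gives a (2, 220500) shape.

  3. I also wrote up the track attributes there, so refer to them when using the library.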
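
As a rough illustration of the musdb route, the sketch below follows the chunk-loading pattern from the musdb documentation: set `chunk_duration` and `chunk_start` on a track, then read `track.audio` (the mixture) and `track.targets["vocals"].audio` (the label). It is not the code in dataloader.ipynb. The function name `load_vocal_pair` and the 5-second default are assumptions; 5 seconds is chosen only because 220500 samples at MUSDB18's 44.1 kHz rate is 5 s, and the transpose assumes the (2, 220500) shape above is (channels, samples).

```python
import random


def load_vocal_pair(track, seq_duration=5.0, rng=random):
    """Return (mixture, vocals) for one random chunk of a musdb track.

    `track` is expected to behave like a musdb track object: it exposes
    `duration` in seconds, settable `chunk_start` / `chunk_duration`, and
    `audio` arrays shaped (samples, channels). Hypothetical helper.
    """
    if track.duration < seq_duration:
        raise ValueError("track is shorter than the requested chunk")

    # Choose where the chunk starts, then tell the track to load only that part.
    track.chunk_duration = seq_duration
    track.chunk_start = rng.uniform(0, track.duration - seq_duration)

    # musdb returns (samples, channels); transpose to (channels, samples).
    x = track.audio.T                    # mixture, the model input
    y = track.targets["vocals"].audio.T  # vocals stem, the label
    return x, y


# Usage with the real library (requires musdb and a local copy of the dataset):
#   import musdb
#   mus = musdb.DB(root="path/to/musdb18", subsets="train")
#   x, y = load_vocal_pair(mus.tracks[0])
```

Selecting the chunk before reading `audio` means only the requested segment is decoded, which keeps per-item loading cheap inside a `Dataset.__getitem__`.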

feVeRin commented 1 year ago

Dataloader fix completed: https://github.com/feVeRin/Singing_Voice_Synthesis/commit/5ac5cce2510072129e28279806575ac00af73d35