TimSC / pycrocosm

OSM API v0.6 implemented in Django/Python

Import changeset metadata into database #6

Closed TimSC closed 6 years ago

TimSC commented 6 years ago

A tool is needed to import old changesets, including comments and timestamps.

TimSC commented 6 years ago

@4x4falcon this should be working now. Changeset metadata should be like https://archive.org/download/fosm-org-minutely-diffs/changesets.tar.gz but extracted to separate files.

Then set changesets_import_path in pgmap/config.cfg

Then run pgmap/admin and import changeset metadata

4x4falcon commented 6 years ago

When you say separate files, do you mean as extracted, or do they need to be separated further so that each changeset's metadata is in its own file? I have most of the earlier changesets (< 1000000000) as individual files.

TimSC commented 6 years ago

@4x4falcon I mean files with one or more changesets within an osm tag. I think I generated my changeset dump by multiple queries to http://fosm.org/api/0.6/changesets so that is basically the format for import.

Are your individual files within an osm tag? That should import ok as well.

4x4falcon commented 6 years ago

The individual files are from the same location, so they should work then. OK, I'll let you know if there are any issues.

4x4falcon commented 6 years ago

Not working with either your files or mine. The following occurs:

Reading files in /home/ross/fosm/data/change
/home/ross/fosm/data/change/00000.osm,248183
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

That is the first file from your tar.

I tried in a virtualenv as well, but with the same outcome.

This is the file I tried.

http://fosm.org/api/0.6/changeset/2000

Everything is updated to the latest from GitHub.

TimSC commented 6 years ago

@4x4falcon umm that is a bit weird. I added some additional checks into pgmap but I don't think I've found the root cause. Can you add the "-g" switch to cppflags in the makefile and see if the debug messages are more helpful? Are you familiar with gdb?

4x4falcon commented 6 years ago

Using gdb to run admin, I get:

Reading files in /home/ross/fosm/data/change
/home/ross/fosm/data/change/00000.osm,248183
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Program received signal SIGABRT, Aborted.
0x00007ffff66ec428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

This is after a git update and compiling with the -g option.

TimSC commented 6 years ago

Can you run "bt" in gdb to get the backtrace?

4x4falcon commented 6 years ago

#0  0x00007ffff66ec428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff66ee02a in __GI_abort () at abort.c:89
#2  0x00007ffff702f84d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff702d6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff702d701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff702d919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff702debc in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x0000000000414b68 in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) ()
#8  0x00000000004148ef in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) ()
#9  0x000000000041473c in std::_Vector_base<char, std::allocator<char> >::_M_allocate(unsigned long) ()
#10 0x000000000041446d in std::_Vector_base<char, std::allocator<char> >::_M_create_storage(unsigned long) ()
#11 0x0000000000414217 in std::_Vector_base<char, std::allocator<char> >::_Vector_base(unsigned long, std::allocator<char> const&) ()
#12 0x0000000000413f30 in std::vector<char, std::allocator<char> >::vector(unsigned long, std::allocator<char> const&) ()
#13 0x0000000000412c54 in ReadFileContents(char const*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ()
#14 0x000000000046b0c4 in PgAdmin::ImportChangesetMetadata(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, PgMapError&) ()
#15 0x0000000000409ea0 in main (argc=2, argv=0x7fffffffe5b8) at admin.cpp:173

TimSC commented 6 years ago

Thanks, I'll have a think about this. The backtrace does narrow it down.

TimSC commented 6 years ago

@4x4falcon Can you update pgmap and try again? I added a little code to try to isolate it but it seems weird that it fails in the ReadFileContents function.

4x4falcon commented 6 years ago

I tried again with the 00000.osm file from your tar and got this result:

Reading files in /home/ross/fosm/data/change
File length 248183
/home/ross/fosm/data/change/00000.osm,248183
File length 9223372036854775807
Failed to allocate buffer: std::bad_alloc
Failed to allocate buffer: std::bad_alloc

With the single changeset 2000.osm I get this result:

Reading files in /home/ross/fosm/data/change
File length 380
/home/ross/fosm/data/change/2000.osm,380
File length 9223372036854775807
Failed to allocate buffer: std::bad_alloc
Failed to allocate buffer: std::bad_alloc

4x4falcon commented 6 years ago

Further to this, I did have another directory inside the directory that held the changesets. On removing this directory, the import works fine. It seems to be trying to process the other directory as if it were a file.

TimSC commented 6 years ago

Yep, I think you are on to something there! I'll put in a fix in a bit.
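
For reference, the kind of guard discussed above can be sketched as follows. This is not the actual pgmap fix, only an illustration under assumptions: it uses C++17 std::filesystem, and the helper names ListRegularFiles and ReadRegularFile are hypothetical and do not exist in pgmap. The second "File length" value in the logs, 9223372036854775807, is 2^63 - 1 (the largest signed 64-bit integer), which is consistent with a length query on something that is not a regular file, followed by an attempt to allocate a buffer of that size.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of pgmap): return the regular files that sit
// directly inside dirPath, sorted by name. Subdirectories and other
// non-file entries are skipped instead of being passed to the reader.
std::vector<fs::path> ListRegularFiles(const fs::path &dirPath)
{
    std::vector<fs::path> out;
    for (const fs::directory_entry &entry : fs::directory_iterator(dirPath))
    {
        if (entry.is_regular_file())
            out.push_back(entry.path());
    }
    std::sort(out.begin(), out.end());
    return out;
}

// Hypothetical helper (not part of pgmap): read a whole file into content.
// Returns false, without allocating anything, if path is not a regular file
// or cannot be opened.
bool ReadRegularFile(const fs::path &path, std::string &content)
{
    std::error_code ec;
    if (!fs::is_regular_file(path, ec))
        return false;

    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;

    // Read until end of stream rather than trusting a precomputed length.
    content.assign(std::istreambuf_iterator<char>(in),
                   std::istreambuf_iterator<char>());
    return true;
}
```

With helpers like these, an import loop would iterate over ListRegularFiles(importPath) and call ReadRegularFile on each entry, so a stray subdirectory in the changeset folder is ignored rather than causing std::bad_alloc.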